apis for server admins: rest, extract, tsm oh my! - tableau … · 2020-01-06 · apis for server...
TRANSCRIPT
APIs for Server Admins: REST, Extract, TSM Oh My!
William Lang
Senior Software Engineer
Tableau
@willlang
# T C 1 8
Tom O’Neil
Senior Software Engineer
Tableau
Automation
Extensions
Embedded Analytics
Data Connectivity
Data Science
Tableau Platform
Inte
gra
tion
s
Enabling Integrations for Developers
Introduction to the Tableau Server REST API
User Provisioning
Automate User Provisioning
High turnover in user accountsConference: Provide accounts to each attendee
University: Must provision hundreds (or thousands) of new users each semester
Corporate merger: Add accounts for the new employees
Will use Python and Tableau Server client
Sign In to Tableau Server
tableau_auth = TSC.TableauAuth(username, password, site)
server_client = TSC.Server(“https://us-west-2b.online.tableau.com”)
server_client.auth.sign_in(tableau_auth)
Creating a User
user = UserItem(username, role)user = server_client.users.add(user)
if user.id is not None:server_client.users.update(user, password)print("User added.")
server_client.groups.add_user(group, user)
Creating Multiple Users
Pull user data from a file, database, identity management system, or other source
Iterate through list:Generate username (on-prem)
Execute the user creation code from previous slide
E-mail the new user a welcome message (on-prem)
How Do I Get Tableau Server Client?
Installing Python3Linux: https://docs.python-guide.org/en/latest/starting/install3/linux/
MacOS: https://brew.sh/
brew install python3
Windows: https://www.python.org/downloads/release/python-362/
Installing tableauserverclientpip3 install tableauserverclient
Datasource Refresh
Tasks are groupings of a datasource (or workbook) with a
schedule
Schedules defined when a task runs
But what happens when you need a task to update a datasource
dependent on another datasource?
How do we ensure that our dependent task runs?
Tasks
Running a Taskif task_id:
task = server.tasks.get_by_id(task_id)
if dependency_id:dependency = server.tasks.get_by_id(dependency_id)if dependency.updated_at > task.updated_at:
server.tasks.run(task)
else:…
What is REST?
Representational State Transfer. Not a helpful title.
Defines a set of rules for messages to be sent from a client to a server and vice versa
What makes something RESTful?
• Stateless
• Lightweight
• Implementation of client and server are independent of one
another
Migrating Workbooks
Migrating Workbooks: Projects
Imagine you have a staging and production site
Workbooks are stored in project folders
Need to ensure that there exists a project in both the staging and production server or site
How do we ensure we maintain ownership?
Impersonation
Sample Code (default site):
tableau_auth = TSC.TableauAuth(username, password)
server = TSC.Server('https://10ay.online.tableau.com')
server.auth.sign_in(tableau_auth)
Sample Code (specific site):
tableau_auth = TSC.TableauAuth(username, password, site_name)
server = TSC.Server('https://10ay.online.tableau.com')
server.auth.sign_in(tableau_auth)
Sample Code (impersonation):
tableau_auth = TSC.TableauAuth(username, password, site_name, user_id)
server = TSC.Server('https://my-tableau-server.myorg')
server.auth.sign_in(tableau_auth)
Migrating Workbookssource_workbooks, pagination = source_server.workbooks.get()
for workbook in source_workbooks:
#If project doesn't exist on destination, just upload to default projectfile_path = source_server.workbooks.download(workbook.id, temp_dir)
if workbook.project_id in project_map:workbook.project_id = project_map[workbook.project_id]
else:workbook.project_id = project_map['default']
dest_server.workbooks.publish(workbook, file_path, dest_server.PublishMode.Overwrite)
Embedded Credentials
Stripped Embedded User Credentials
After you republish the workbook, you can update the credentials usingUpdate Workbook Connection API
Get the connection id using Query Workbook Connection API
Subscriptions
Users can subscribe to Workbooks and receive daily emails with the updates to those workbooks
New server admin joins the team and needs same subscriptions
Use Tableau Server REST API to copy the subscriptions from one admin user to the new one
We’re going to do this the hard way, using Python without Tableau Server client
Copy Subscriptions
Copy Subscriptions: Get User IDheaders = {
'X-Tableau-Auth': auth_token,
‘Content-Type’: ’application/json’,
‘Accept': ’application/json’
}
url = ’https://us-west-2b.online.tableau.com/api/3.1/sites/' + site_id + \
'/users?filter=name:eq:' + uname
user_response = requests.get(url, headers=headers).json
Tableau Server REST Sign In
REQUESTPOST https://us-west-2b.online.tableau.com/api/3.1/auth/signin
RESPONSE200 (OK)
{"credentials": {
"name": "admin","password": "p@ssword","site": {
"contentUrl": "mysite"}
}}
{"credentials": {
"site": {"id": "9a8b7c6d5-e4f3-a2b1-c0d9-e8f7a6b5c4d","contentUrl": "mysite"
},"user": {
"id": "9f9e9d9c-8b8a-8f8e-7d7c-7b7a6f6d6e6d"},"token": "12ab34cd56ef78ab90cd12ef34ab56cd"
}}
Include the token as an HTTP header in all further requests:
X-Tableau-Auth: 12ab34cd56ef78ab90cd12ef34ab56cd
By default, token is good for 4 hours
Tableau Server REST Auth Header
Copy Subscriptions: Get User ID
headers = {
'X-Tableau-Auth': auth_token,
‘Content-Type’: ’application/json’,
‘Accept': ’application/json’
}
url = ’https://us-west-2b.online.tableau.com/api/3.1/sites/' + site_id + \
'/users?filter=name:eq:' + uname
user_response = requests.get(url, headers=headers).json
If user_response.status_code == requests.codes.ok:
#continue
REST URLs
https://us-west-2b.online.tableau.com/api/3.1/sites/
9a8b7c6d-5e4f-3a2b-1c0d-9e8f7a6b5c4d/
users?filter=name:eq:adminuser
Copy Subscriptions: Duplication
#Set headers: token and content type
response = requests.get(‘https://us-west-2b.online.tableau.com/api/3.1/sites/’ + \
site_id + '/subscriptions', headers=headers).json
for subscription in response.subscriptions.subscription:
#Check that this subscription is for this particular user
if (subscription['user']['id']) != ref_user_id:
continue
#Build request JSON
Copy Subscriptions: Example Request JSON
{…"subscription": {
"subject": "Site Migration Dashboard","content": {
"id": "3cd4eed3-56c5-4253-974a-9e37660cef0b","type": "View"
},"schedule": {
"id": "7eeb09c4-da4e-47b7-b41c-62c68aff03a0"},"user": {
"id": "bdbe273e-c2d7-42db-a1f1-297ff7384e47"}
}…
}
Copy Subscriptions: Build Request JSON
request_data = {
'subscription’: {'subject': subscription['subject’],
'content’: { 'id': subscription['content']['id’],
'type': subscription['content']['type’] },
'schedule’: { 'id': subscription['schedule']['id’] },
'user’: { 'id': new_user_id }
}
}
request_json = json.dumps(request_data)
Copy Subscriptions: Duplication
response = requests.get(https://us-west-2b.online.tableau.com/api/3.1/sites/’ + \
site_id + '/subscriptions', headers=headers).json
for subscription in response.subscriptions.subscription:
#Check that this subscription is for this particular user
if (subscription['user']['id']) != ref_user_id:
continue
#Build request JSON
#Set headers: token and content type
response = requests.post(https://us-west-2b.online.tableau.com/api/3.1/sites/’ + \
site_id + '/subscriptions', headers=headers, data=request_json)
Copy Subscriptions: Duplication
#request to get the subscriptions
response = requests.get(server + '/api/' + version + '/sites/’ + \
site_id + '/subscriptions', headers=headers).json
#request to create the new subscription
response = requests.post(server + '/api/' + version + '/sites/’ + \site_id + \
'/subscriptions', headers=headers, data=request_json)
if response.status_code == 201:
print(‘Subscription copied.’)
Headers: Used for authentication, content type, and/or API versioning
Content type: Typically JSON (can also be XML)
Path: Server URL, (optionally) API version, resource(s), filters or sorting
Verbs: e.g. GET, POST, PUT, DELETE
Response codes: e.g. OK, created, no content, bad request, not found
Takeaways: HTTP
Webhooks!!!
Webhooks: Signin// make the request$response = $guzzle->post('/api/exp/auth/signin', [
'headers' => ['Content-Type' => 'application/json’,'Accept' => 'application/json’
],'json' => [
'credentials' => ['name' => $username,'password' => $password,'site' => [
'contentUrl' => $site]
]]
]);
Webhooks{"webhook": {"name": "My Webhook!","webhook-source": {"webhook-source-event-workbook-created": {}
},"webhook-destination": {"webhook-destination-http": {"method": "POST","url": "https://my-app.example.com/my-created-workbook-webhook"
}}
}}
Webhooks: Payload{"resource":"WORKBOOK","event-type":"WorkbookCreated","resource-name":"My Workbook","site-id":"8b2a95d8-52b9-40a4-8712-cd6da771bd1b","resource-id":"99"
}
Webhooks: Request// create our webhook$response = $guzzle->post(sprintf('/sites/%s/webhooks', $siteId), [
'json' => ['webhook' => [
'name' => $name,'webhook-source' => [
$event => new \stdClass(),],'webhook-destination' => [
'webhook-destination-http' => ['method' => 'POST’,'url' => $url
]]
]]
]);
Pruning
Pruning
Running out of disk space
Delete old, unused content
Don’t know Python? Don’t like Python? Don’t want to learn it?
We are going to use Java
Pruning
REST client and server implementations are completely independent
REST libraries exist for a variety of languages
We will use Spring’s RestTemplate Java library
Part of the spring-web package
Included in Spring Boot, which we will use to make our application runnable
Pruning
{"credentials": {"name": "admin","password": "p@ssword","site": {"contentUrl": "MySite"
}}
}
Remember exactly what that sign in JSON looks like? Probably Not.
Pruning: Credentials Class
@JsonIgnoreProperties(ignoreUnknown = true)@JsonInclude(Include.NON_NULL)@JsonTypeName(value = "credentials")@JsonTypeInfo(include = As.WRAPPER_OBJECT, use = Id.NAME)public class Credentials {
private String name;
private String password;
private String token;
private Site site;
private User user;
// getters and setters follow
// Get the credentials parametersCredentials requestCredentials = new Credentials();requestCredentials.setName(opts.getOptionValue("user"));requestCredentials.setPassword(opts.getOptionValue("password"));requestCredentials.setSite(new Site(opts.getOptionValue(”siteUrl")));
RestTemplate restTemplate = new RestTemplate();
// Make the POST requestCredentials responseCredentials = restTemplate.postForObject(”https://us-west-2b.online.tableau.com/api/3.1/auth/signin", requestCredentials, Credentials.class);
String token = responseCredentials.getToken();
Pruning: Login
Pruning: Get Workbooksint currentPage = 1;boolean hasMorePages = true;while (hasMorePages) {
// Get the current page of workbooksWorkbooksResponse workbooks = restTemplate.getForObject(”https://us-west-
2b.online.tableau.com/api/3.1/sites/” + getSiteId() + "/workbooks?pageNumber=" + currentPage+ "&sort=updatedAt:asc", WorkbooksResponse.class);
for (Workbook workbook: workbooks.getWorkbooks().getWorkbook()) {if ( ((new Date()).getTime() - workbook.getUpdatedAt().getTime()) >
(1000*60*60*24*maxAge)) {// Workbook is more than maxAge days oldworkbooksToDelete.add(workbook);
} else {System.out.println("Newest date reached.");hasMorePages = false;break;
}}
Pruning: Pagination
"pagination": {"pageNumber": "1","pageSize": "100","totalAvailable": "276"
}
Query responses are paginated and contain a pagination header:
Pruning: Pagination@JsonIgnoreProperties(ignoreUnknown = true)@JsonInclude(Include.NON_NULL)public class Pagination {
private int pageNumber;private int pageSize;private int totalAvailable;
Pruning: Paginate WorkbooksWhile (hasMorePages) {
// Get and process current workbook pageif (workbooks.getPagination().getTotalAvailable() >
(workbooks.getPagination().getPageSize() * currentPage) ) {// Still more pages to view, so increment the current pagecurrentPage++;
} else {// No more pages to view, so stop making querieshasMorePages = false;
}}
Pruning: Delete Workbooksfor (Workbook workbook: workbooksToDelete) {
System.out.println("Deleting workbook " + workbook.getName() + "...");restTemplate.delete(”https://us-west-2b.online.tableau.com/api/3.1/sites/" +
getSiteId() + "/workbooks/" + workbook.getId());}
Pruning
Process to work around caching:Retrieve all data
Process to find everything to delete
Delete entities
Why do we query all the results first before deleting?
Caching
Changing Content Owner
Changing Content Owner
Tableau Server users cannot be deleted if they still
own content
Need to delete a user who owns hundreds of workbooks
Get all of that user's workbooks and transfer ownership
to another user
Changing Content Owner: Get Userprivate function getUser(Client $guzzle, string $siteId, string $user) : array {
$response = $guzzle->get(sprintf('/api/3.1/sites/%s/users?filter=name:eq:%s', $siteId, $user));
// check to make sure we only found one$json = json_decode((string)$response->getBody(), true);if ($json['pagination']['totalAvailable'] != 1) {
throw new \Exception("Expecting exactly 1 user for user. Found " .$json['pagination']['totalAvailable']);
}
// store itreturn $json['users']['user'][0];
}
Changing Content Owner: JSON Response{
"pagination": {"pageNumber":"1","pageSize":"100","totalAvailable":"1"
},"users": {
"user": [{"id":"bcdc5fe9-c59c-11e8-9bbc-f23c9116ad45","name":"[email protected]","siteRole":"SiteAdministratorCreator","authSetting":"ServerDefault","lastLogin":"2018-10-01T17:04:44Z","externalAuthUserId":"...."
}]}
}
Changing Content Owner: Workbook Request
// let's get all the workbooks owned by oldOwner and update them to be owned by newOwner$response = $guzzle->get(sprintf('/api/3.1/sites/%s/users/%s/workbooks', $siteId, $oldOwner['id']));
// get the workbooks from the response$json = json_decode((string)$response->getBody(), true);foreach ($json['workbooks']['workbook'] as $workbook) {
$id = $workbook['id'];
// let's do the update now$response = $guzzle->put(sprintf('/api/3.1/sites/%s/workbooks/%s', $siteId, $id), [
'json' => ['workbook' => [
'owner' => ['id' => $newOwner['id’]
]]
]]);
}
Daily Digest
Automated Digest Using AWS Lambda
Automated daily e-mail that lists all workbooks with updates over prior 24 hours
Everyone is using “The Cloud” these days – we will too!
Use AWS Lambda to automate the process serverlessly
Lambdas run in the Amazon cloud on demand, or via schedule, without provisioning dedicated server resources
Will create a Lambda, written in Python using Tableau Server Client, that runs nightly
Automated Digest Using AWS Lambda
Create Lambda function:Use AWS Web UI
Navigate to Lambda Service
Create function
Author from scratch
Python 3.6 runtime
Specify or create IAM role
Automated Digest Using AWS Lambda
for workbook in all_workbooks:workbook_age = now - workbook.updated_atif workbook_age <= datetime.timedelta(1):
daily_digest_list.append([workbook.id, workbook.name, workbook.updated_at.__str__(), ''])
if pagination_item.page_number * pagination_item.page_size >=pagination_item.total_available:
break
def lambda_handler(event, context):
req_options = TSC.RequestOptions()req_options.sort.add(TSC.Sort(TSC.RequestOptions.Field.UpdatedAt, TSC.RequestOptions.Direction.Desc))all_workbooks, pagination_item = server_client.workbooks.get(req_options)
#Retrieve next page of workbooks from APIreq_options.page_number(req_options.pagenumber + 1)all_workbooks, pagination_item = server_client.workbooks.get(req_options)
Automated Digest Using AWS Lambda
Create CloudWatch event rule:
AWS Web UICloudWatch ServiceCreate Event RuleSpecify schedule:Fixed rate
Cron expression
Add TargetSelect Lambda function
Tagging Workbooks with the Document API
What is it?The document API provides a means to easily extract information from a Tableau Workbook file (twb)
Why is this awesome?No more manually editing XML! No more accidental XML syntax mistakes!
What kind of information?Dimensions, measures, workbook names, filters
What can I do with it?
Tag all workbooks that use a specific field
Report on workbooks that are using out of date data sources
Email owners of workbooks that use calculated fields that need to be updated
Tagging Workbooks: Downloading Workbooks
for workbook in all_workbooks:file_name = server.workbooks.download(workbook.id, '.cache/', no_extract=True)workbook.file_name = file_name
while total > len(all_workbooks):workbooks, pagination = server.workbooks.get(RequestOptions(pagenumber=page,
pagesize=page_size))total = pagination.total_available
for workbook in workbooks:all_workbooks.append(workbook)page += 1
Tagging Workbooks: Datasources and Fields
wb = Nonetry:
wb = Workbook(workbook.file_name)except FileNotFoundError:
continue
for datasource in wb.datasources:for count, ds_field in enumerate(datasource.fields.values()):
if field == ds_field.name:taggable_workbooks.append([workbook.id, workbook.name])workbook.tags.update([tag])server.workbooks.update(workbook)
from tableaudocumentapi import Workbookfrom tableaudocumentapi import Field
Checking Server Health
Sign In To TSM
// login to tsm$response = $guzzle->post('/api/1.0/login', [
'json' => ['authentication' => [
'name' => $username,'password' => $password
]]
]);
// guzzle client$guzzle = new Client([
'base_uri' => $server,'headers' => [
'Content-Type' => 'application/json’,'Accept' => 'application/json’
],'cookies' => true
]);
Get Nodes and Their Status// get the nodes$response = $guzzle->get('/api/1.0/nodes');$json = json_decode((string)$response->getBody(), true);$nodes = $json['clusterNodes'];
foreach ($nodes as $node) {// get more information on a node$response = $guzzle->get(sprintf('/api/1.0/status/nodes/%s', $node['id']));$json = json_decode((string)$response->getBody(), true);print_r($json);
}
Extract API
Extract API
Really easy to use
Lightweight, around ~1mb total!
Extract consists of tables and rows only
Generate a TDE or a hyper extract
Extract API: Create Your Extract
final Extract extract = new Extract(“tmp/output.hyper”);
Extract API: Table Definitionfinal TableDefinition employeeDef = new TableDefinition();employeeDef.addColumn("id", Type.INTEGER);employeeDef.addColumnWithCollation("name", Type.UNICODE_STRING, Collation.EN_US);employeeDef.addColumnWithCollation("position", Type.UNICODE_STRING, Collation.EN_US);employeeDef.addColumn("start_date", Type.DATE);extract.addTable("employees", employeeDef);
Extract API: Inserting Rowsfinal Table table = extract.openTable("employees");
Row r = new Row(employeeDef);r.setInteger(0, 1);r.setString(1, "John");r.setString(2, "Co-Founder");r.setDate(3, 2018, 01, 01);table.insert(r);
Resources
Postman: https://www.getpostman.com/
VSCode rest client: https://github.com/Huachao/vscode-
restclient/
Useful Tools
Tableau Server Client(Python):
https://github.com/tableau/server-client-pythonOnly officially supported client
Java(Spring RestTemplate):
https://spring.io/guides/gs/consuming-rest/
Guzzle Client: https://github.com/guzzle/guzzle
Tableau Rest API samples: https://github.com/tableau/rest-
api-samplesJava and Python
Useful Libraries
REST APIhttps://onlinehelp.tableau.com/current/api/rest_api/en-us/help.htm
Extract APIhttps://onlinehelp.tableau.com/current/api/sdk/en-us/help.htm#SDK/tableau_sdk.htm
TSMhttps://onlinehelp.tableau.com/v0.0/api/tsm_api/en-us/docs/tsm-reference.htm
Document APIhttp://tableau.github.io/document-api-python/
Documentation
AU T O M AT I O N R E L AT E D S E S S I O N S
Chalk Talk – Tableau Server Automation APIsOct-24 | 15:30 – 16:30
(Hands on Training) REST APIOct-23 | 14:15 – 16:45 Oct-24 | 10:15 – 12:45
Big Easy Data Security | Scalable…Roux Level SecurityOct-23 | 16:00 – 17:00 Oct-24 | 10:15 – 11:15
Using Tableau Server Client and the REST API…Oct-23 | 14:15 – 15:15 Oct-25 | 10:45 – 11:45
#DataDev Resources
TC18 Developer Track Contenthttp://tabsoft.co/tcdevtrack
Tableau Developer Programhttp://tableau.com/developer
Free environment for development
Early access to info and APIs
Tableau on GitHubhttp://github.com/tableau
Please complete the
session survey from the My
Evaluations menu
in your TC18 app
#DataDev Resources
TC18 Developer Track Contenthttp://tabsoft.co/tcdevtrack
Tableau Developer Programhttp://tableau.com/developer
Free environment for development
Early access to info and APIs
Tableau on GitHubhttp://github.com/tableau
Thank you!
#TC18
Contact or CTA info goes here