A Trifacta client that makes it easy to integrate Trifacta into your production and data science workflows:
- Jupyter: Invoke Trifacta jobs from a Jupyter notebook and pass data back and forth between Jupyter and Trifacta
- Other Notebooks: Integrate Trifacta with Azure Databricks, Zeppelin, or any other notebook-style interface that supports Python
- Scripts: Automate Trifacta jobs and their input/output using Python scripts that can be run from the command line or called from an external scheduler
This library makes it simple to do the following:
- Connect to a Trifacta instance
- Run a job
- Download results to a CSV file and view them in a pandas DataFrame
Note that file uploads and downloads go through Amazon S3, using the boto3 API.
#!pip install trifacta
import trifacta
If you need an access token, you can generate it as follows:
# Step 1: Connect to Trifacta by providing the URL and API access token
t = trifacta.Client('http://partnerdemo.amer.trifacta.net:3005', 'YOUR_ACCESS_TOKEN')
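To keep the access token out of the notebook itself, the URL and token can be read from the environment before connecting. The following is a minimal sketch; the variable names `TRIFACTA_URL` and `TRIFACTA_TOKEN` and the helper `load_trifacta_settings` are assumptions made for illustration and are not part of the trifacta package.

```python
import os


def load_trifacta_settings(env=None):
    """Return (url, token) read from environment variables.

    TRIFACTA_URL and TRIFACTA_TOKEN are hypothetical names chosen for
    this sketch; use whatever names fit your deployment.
    """
    env = os.environ if env is None else env
    url = env.get("TRIFACTA_URL", "").rstrip("/")
    token = env.get("TRIFACTA_TOKEN", "")
    if not url or not token:
        raise RuntimeError("Set TRIFACTA_URL and TRIFACTA_TOKEN before connecting")
    return url, token


# Usage, mirroring Step 1:
#   import trifacta
#   url, token = load_trifacta_settings()
#   t = trifacta.Client(url, token)
```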
Make sure that you have run the job manually at least once before running it from the client.
# Step 2: Run the job
t.run_job(23)
About to run job
{'sessionId': '9d339e65-8898-4165-871b-b9db848dc099', 'reason': 'JobStarted', 'jobGraph': {'vertices': [76, 77], 'edges': [{'source': 76, 'target': 77}]}, 'id': 42, 'jobs': {'data': [{'id': 76}, {'id': 77}]}}
2020-02-25 11:19:58.508231 InProgress
2020-02-25 11:20:03.700189 InProgress
2020-02-25 11:20:08.887794 Complete
True
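The output above shows the job status being reported roughly every five seconds until it reaches `Complete`. How `run_job` does this internally is not documented here, but the general pattern can be sketched as a polling loop. In the sketch below, `get_status` is a hypothetical callable that returns the current status string, and the terminal statuses other than `Complete` are assumptions.

```python
import time
from datetime import datetime


def wait_for_job(get_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status.

    get_status is a hypothetical callable returning a status string such
    as 'InProgress' or 'Complete'; it is not part of the trifacta package.
    Returns True on 'Complete', False on any other terminal status or
    when the timeout is reached.
    """
    # 'Failed' and 'Canceled' are assumed terminal statuses for this sketch.
    terminal = {"Complete", "Failed", "Canceled"}
    waited = 0.0
    while True:
        status = get_status()
        print(datetime.now(), status)
        if status in terminal:
            return status == "Complete"
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.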
%env AWS_PROFILE=trifacta_master_trial
env: AWS_PROFILE=trifacta_master_trial
# Step 3: Download results to a CSV file and view in a pandas DataFrame
import boto3
s3 = boto3.client('s3', region_name='us-west-2')
s3.download_file(Bucket='trifacta-partnerdemo-trifactabucket-kkcpnw234feu',
                 Key='trifacta/queryResults/[email protected]/MarketingAnalytics.csv',
                 Filename='MarketingAnalytics.csv')
import pandas as pd
df = pd.read_csv('MarketingAnalytics.csv')
df.head()
| | user_id | customerkey | event_type | event_subtype | Date | advertiser_id | creative_id | url | product_id | domain_url | ... | customeraccount_number | customerphone | customeraddress | cusotmerstate | customerzipcode | customercountry | socialmedia | totalsale | Outlier_Identifier | currencykey |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1126310400000-424 | 1126310400000-424 | click | click | 10-19-2005 | 164332 | 543027 | http://zdnet.com/praesent/lectus/vestibulum/qu... | 1124064000000-475 | zdnet | ... | 310170445527596 | (817)718-7309 | 156 Cozy Berry Arc | CA | 78710 | USA | deneleaf | 7004.54 | False | 1 |
1 | 1229126400000-20 | 1229126400000-20 | click | click | 08-17-2009 | 164332 | 252030 | http://hostgator.com/a/feugiat.js?pid=12331008... | 1233100800000-528 | hostgator | ... | 310150240507900 | (469)201-1812 | 3641 Euismod Avenue | CA | 10769 | USA | kinphanng | 4853.35 | False | 1 |
2 | 1126828800000-518 | 1126828800000-518 | view | view | 04-05-2006 | 164332 | 562765 | http://fc2.com/convallis/duis/consequat/dui/ne... | 1121904000000-509 | fc2 | ... | 310170133079761 | (443)585-1769 | Ap #543-7410 Accumsan Rd. | CA | 92845 | USA | waldeelbailarin | 6885.15 | False | 1 |
3 | 1130112000000-336 | 1130112000000-336 | click | click | 04-05-2006 | 164332 | 466942 | http://biblegateway.com/est/phasellus/sit/amet... | 1130284800000-343 | biblegateway | ... | 310120073380564 | (215)669-3055 | 900-8123 Aliquam Av. | CA | 85517 | USA | charlrey | 2593.31 | False | 1 |
4 | 1121990400000-216 | 1121990400000-216 | view | view | 09-27-2005 | 164332 | 400316 | https://zdnet.com/elementum/nullam/varius/null... | 1108339200000-416 | zdnet | ... | 310160496868669 | 301 742 1112 | 164 Cozy Anchor Rd | CA | 60101 | USA | scottylago | 3958.25 | False | 1 |
5 rows × 31 columns
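If you download results from more than one job, Step 3 can be wrapped in a small helper that derives the local filename from the S3 key. This is a sketch only: `download_result` is a hypothetical helper, not part of the trifacta package or boto3. It takes the S3 client as an argument and calls the same `download_file` method used in Step 3.

```python
import os
import posixpath


def download_result(s3_client, bucket, key, dest_dir="."):
    """Download one S3 object and return the local path it was saved to.

    s3_client is expected to be a boto3 S3 client, for example
    boto3.client('s3', region_name='us-west-2').
    """
    # S3 keys use forward slashes, so take the last path segment as the filename.
    filename = posixpath.basename(key)
    if not filename:
        raise ValueError(f"Key {key!r} does not name a file")
    local_path = os.path.join(dest_dir, filename)
    s3_client.download_file(Bucket=bucket, Key=key, Filename=local_path)
    return local_path


# Usage (bucket and key are placeholders):
#   import boto3
#   import pandas as pd
#   s3 = boto3.client('s3', region_name='us-west-2')
#   path = download_result(s3, 'YOUR_BUCKET', 'path/to/MarketingAnalytics.csv')
#   df = pd.read_csv(path)
```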