This is a library for making batch requests to the Google Analytics Core Reporting API v3 and extracting data from a Google Analytics property into Python 3 data structures.
The package uses:
- OAuth 2.0 client or server access to the Google Analytics API (oauth2client==3.0.0), for connecting to Google Analytics
- Google Analytics Core Reporting API v3, for extracting data
- Google Analytics Metadata API, for an integrated reference lookup of dimensions and metrics
- Google Analytics Management API, for getting the Account, Property and View tree
Dependencies:
- Pandas > 0.13.0, for transforming data into a pandas DataFrame object
- Numpy > 1.0.0, for slicing arrays into chunks
- google-api-python-client > 1.5.0, for calling the Google APIs
Recommended usage:
- An interactive Jupyter shell for analyzing data
Installation:
- Via pip, using the following command: # sudo pip install pga
The latest versions of Pandas, Numpy and oauth2client will be installed automatically as dependencies.
First, you will need to get a Google client_secret JSON file from the Google API Console.
You may choose one of the following Client ID types:
- Service account client
- Web application
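The two Client ID types map onto the 'Server' and 'Client' values accepted by the constructor's type_of_connection parameter. As a minimal sketch, the type of a downloaded secret file can usually be recognized from its layout: service account key files carry a top-level "type": "service_account" field, while web application secrets are nested under a "web" key. The helper name guess_connection_type is hypothetical and is not part of pga.

```python
import json


def guess_connection_type(secret: dict) -> str:
    """Guess a type_of_connection value from a parsed secret JSON file.

    Hypothetical helper for illustration only; it is not part of pga.
    """
    # Service account key files carry a top-level "type" field.
    if secret.get("type") == "service_account":
        return "Server"
    # OAuth client secrets for web applications are nested under "web".
    if "web" in secret:
        return "Client"
    raise ValueError("Unrecognized client secret layout")


def load_secret(path: str) -> dict:
    """Read a secret JSON file from disk (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A typical call would be guess_connection_type(load_secret(path)), with path pointing at the file downloaded from the Google API Console.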
PGA.init(key_file_location=None,type_of_connection=None,facet_chunk=10,count_day_slice=1)
Constructor. Sets the parameters for the basic functionality of the instance.
Parameters:
- key_file_location : string. Path to the secret JSON file.
- type_of_connection : string. Available values are 'Client' and 'Server'. If you use a service account, choose 'Server'; if you use a web application, choose 'Client'.
- facet_chunk : int, optional. Number of requests in a chunk; all requests in a chunk are executed in parallel. Important: Google Universal Analytics executes only 10 parallel requests per second. If you need more, contact Google through its form to increase this limit.
- count_day_slice : int, optional. Number of days in each slice of the [start-date, end-date] range of your request. For example:
  (input) {'count_day_slice': 2, 'start_date': '2016-12-01', 'end_date': '2016-12-05'}
  (output) [{'start_date': '2016-12-01', 'end_date': '2016-12-02'}, {'start_date': '2016-12-03', 'end_date': '2016-12-04'}, {'start_date': '2016-12-05', 'end_date': '2016-12-05'}]
Returns:
- self : the instance itself, with its current settings.
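The effect of count_day_slice and facet_chunk can be sketched with the standard library alone. This is an illustration of the behavior described above, not the library's own code: the function names slice_date_range and chunk_requests are hypothetical, and the real implementation may differ (for instance, it may use Numpy for chunking).

```python
from datetime import date, timedelta


def slice_date_range(start_date: str, end_date: str, count_day_slice: int = 1) -> list:
    """Split [start_date, end_date] into consecutive slices of count_day_slice days."""
    if count_day_slice < 1:
        raise ValueError("count_day_slice must be at least 1")
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    slices = []
    while start <= end:
        # The last slice is shortened so that it never runs past end_date.
        stop = min(start + timedelta(days=count_day_slice - 1), end)
        slices.append({"start_date": start.isoformat(), "end_date": stop.isoformat()})
        start = stop + timedelta(days=1)
    return slices


def chunk_requests(requests: list, facet_chunk: int = 10) -> list:
    """Group requests into lists of at most facet_chunk items."""
    if facet_chunk < 1:
        raise ValueError("facet_chunk must be at least 1")
    return [requests[i:i + facet_chunk] for i in range(0, len(requests), facet_chunk)]


if __name__ == "__main__":
    # Reproduces the count_day_slice example from the parameter description.
    for item in slice_date_range("2016-12-01", "2016-12-05", count_day_slice=2):
        print(item)
```

Keeping facet_chunk at 10 or below matches the limit of 10 parallel requests per second mentioned above.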
After the constructor is applied, the instance is created and the client is redirected to a browser for authentication with Google.
Simply add a request to an already instantiated pga object:
Request.add_settings_request(**settings_products)
Parameters:
- **settings_products : kwargs. Query parameters in the Core Reporting API v3 request format. List of query parameters: https://developers.google.com/analytics/devguides/reporting/core/v3/reference?hl=ru#q_summary
Returns:
- self : the instance itself, with its current settings.
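As a rough sketch of what such keyword arguments might look like, the dictionary below follows the underscore spelling of start_date and end_date used in the count_day_slice example. The remaining keys and the view ID ga:12345678 are assumptions for illustration. The hypothetical helper to_core_v3_params is not part of pga; it only shows how underscore keys correspond to the hyphenated Core Reporting v3 names and checks the four parameters that the API requires (ids, start-date, end-date, metrics).

```python
# Query parameters that the Core Reporting API v3 requires on every request.
REQUIRED_PARAMS = {"ids", "start-date", "end-date", "metrics"}


def to_core_v3_params(**settings) -> dict:
    """Map underscore keyword names to Core Reporting v3 parameter names.

    Hypothetical helper for illustration only; it is not part of pga.
    """
    params = {key.replace("_", "-"): value for key, value in settings.items()}
    missing = REQUIRED_PARAMS - params.keys()
    if missing:
        raise ValueError("Missing required query parameters: " + ", ".join(sorted(missing)))
    return params


# Placeholder values; replace the view ID with one you have access to.
settings = {
    "ids": "ga:12345678",
    "start_date": "2016-12-01",
    "end_date": "2016-12-05",
    "metrics": "ga:sessions",
    "dimensions": "ga:date",
}

params = to_core_v3_params(**settings)
```

With a real instance, a dictionary like settings would presumably be passed as add_settings_request(**settings); whether the library expects underscore or hyphenated keys for every parameter is not confirmed here.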
You can update any previously set query parameters later and then make a new request.
Execute all settings to get a DataFrame:
PGA.get_dataframe(groupby=True)
Parameters:
- groupby : boolean. Available values are True and False. If True, the DataFrame is grouped by all dimensions, dates, and start-index, and each column is cast to the appropriate type based on the Google Analytics Metadata API. If False, the DataFrame is not grouped. This option exists so that you can use another library that aggregates and groups data quickly, because in some cases the data is too large and this process is very slow. You may want to look at this project: http://dask.pydata.org/en/latest/
Returns:
- data : pandas.DataFrame object
All settings
Print all current pga settings:
PGA.get_all_settings()
Returns:
- all settings : pandas.DataFrame object
All products
Print all current pga product settings:
PGA.get_all_products()
Returns:
- all product settings : pandas.DataFrame object
ExtraAppsMetaCdm
Look up Google Analytics dimensions and metrics in the metadata:
ExtraAppsMetaCdm.get_list_cdcm(clarify=None)
Parameters:
- clarify : string. The attribute by which dimensions and metrics are selected.
Returns:
- Table of information : pandas.DataFrame object
ExtraAppsManagementAPI
Get the list of Google Universal Analytics objects (Account ID, Property ID, View ID) that you have access to:
PGA.get_all_profile()
Returns:
- Table of information with dimensions or metrics : pandas.DataFrame object