Releases: microsoft/msticpy
User Session Management, MaxMind GeoLite fix, Extract nested dicts from Pandas
User Session Configuration
Do you always have one or more data providers or other components that you need to load for every notebook you create?
I do, and got a bit fed up with typing the same lines of code over and over again.
User session configuration lets you specify which providers are loaded, whether or not to connect and which parameters
to supply at load and connect time. You put all of this into a straightforward YAML file and load it using the following:
import msticpy as mp  # you likely will already be doing this
mp.init_notebook()  # and this
# if you have a "mp_user_session.yaml" in the current directory you can skip the parameter
mp.load_user_session("my_config.yaml")
This example shows the structure of the YAML:
QueryProviders:
  qry_prov_sent:
    DataEnvironment: MSSentinel
    InitArgs:
      debug: True
    Connect: True
    ConnectArgs:
      workspace: MySoc
      auth_methods: ['cli', 'device_code']
  qry_prov_md:
    DataEnvironment: M365D
Components:
  mssentinel:
    Module: msticpy.context.azure
    Class: MicrosoftSentinel
    InitArgs:
    Connect: True
    ConnectArgs:
      workspace: MySoc
      auth_methods: ['cli', 'device_code']
The providers/components created (e.g. qry_prov_sent in this example)
are published back to your notebook Python namespace, so you'll see
these available as variables ready to use.
This configuration file is equivalent to the following code:
qry_prov_sent = mp.QueryProvider("MSSentinel", debug=True)
qry_prov_sent.connect(workspace="MySoc", auth_methods=['cli', 'device_code'])
qry_prov_md = mp.QueryProvider("M365D")
from msticpy.context.azure import MicrosoftSentinel
mssentinel = MicrosoftSentinel()
mssentinel.connect(workspace="MySoc", auth_methods=['cli', 'device_code'])
Not a huge saving, on the face of it, but if you create a lot of notebooks or want to use
msticpy in an automation scenario, it can be very helpful.
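To make the mapping between the YAML and the equivalent code more concrete, here is a minimal sketch that walks a dictionary shaped like the configuration above and lists the calls it corresponds to. The helper names (describe_session, _format_kwargs) are hypothetical and not part of msticpy, and the sketch assumes that InitArgs are passed as keyword arguments at creation time and that a missing Connect key means "do not connect".

```python
def _format_kwargs(kwargs):
    """Render a dict as a Python keyword-argument list."""
    return ", ".join(f"{key}={value!r}" for key, value in kwargs.items())


def describe_session(config):
    """Return the Python statements that a session config roughly corresponds to.

    Hypothetical helper for illustration only; msticpy's own loader may differ.
    """
    lines = []
    for name, settings in (config.get("QueryProviders") or {}).items():
        init_args = _format_kwargs(settings.get("InitArgs") or {})
        env = repr(settings["DataEnvironment"])
        call_args = ", ".join(part for part in (env, init_args) if part)
        lines.append(f"{name} = mp.QueryProvider({call_args})")
        if settings.get("Connect"):  # assumption: absent means no connect
            connect_args = _format_kwargs(settings.get("ConnectArgs") or {})
            lines.append(f"{name}.connect({connect_args})")
    for name, settings in (config.get("Components") or {}).items():
        module, cls = settings["Module"], settings["Class"]
        init_args = _format_kwargs(settings.get("InitArgs") or {})
        lines.append(f"from {module} import {cls}")
        lines.append(f"{name} = {cls}({init_args})")
        if settings.get("Connect"):
            connect_args = _format_kwargs(settings.get("ConnectArgs") or {})
            lines.append(f"{name}.connect({connect_args})")
    return lines


# The same structure as the YAML example, written as a Python dict.
session = {
    "QueryProviders": {
        "qry_prov_sent": {
            "DataEnvironment": "MSSentinel",
            "InitArgs": {"debug": True},
            "Connect": True,
            "ConnectArgs": {"workspace": "MySoc", "auth_methods": ["cli", "device_code"]},
        },
        "qry_prov_md": {"DataEnvironment": "M365D"},
    },
    "Components": {
        "mssentinel": {
            "Module": "msticpy.context.azure",
            "Class": "MicrosoftSentinel",
            "InitArgs": None,
            "Connect": True,
            "ConnectArgs": {"workspace": "MySoc", "auth_methods": ["cli", "device_code"]},
        },
    },
}

equivalent_code = describe_session(session)
print("\n".join(equivalent_code))
```

The sketch only produces text; it does not import msticpy or create any providers.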
Include a verbose=True parameter to load_user_session to see more detailed logging of what is going on.
See the full documentation here
Maxmind GeoIPLite fix
Some time recently (not too sure when) MaxMind changed their download procedure to use
a different URL and authentication mechanism. This was causing auto-update to fail. To use
the new mechanism you need to get your MaxMind User Account ID (log in and look at your
account properties) and add that to your msticpyconfig.yaml as shown below.
OtherProviders:
  GeoIPLite:
    Args:
      AccountID: "1234567"
      AuthKey:
        EnvironmentVar: "MAXMIND_AUTH"
      DBFolder: "~/.msticpy"
    Provider: "GeoLiteLookup"
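In this configuration AccountID is a literal value while AuthKey is a reference to an environment variable. As a rough illustration of how such a mixed Args section can be resolved, the sketch below uses a hypothetical resolve_setting helper (not msticpy's own settings code) and sets a placeholder value for MAXMIND_AUTH purely for the demonstration.

```python
import os


def resolve_setting(value):
    """Return a literal value, or read it from the environment when the
    value is an {"EnvironmentVar": name} reference.

    Hypothetical helper for illustration; not msticpy's implementation.
    """
    if isinstance(value, dict) and "EnvironmentVar" in value:
        return os.environ.get(value["EnvironmentVar"])
    return value


# Args section from the example above, as a Python dict.
geoip_args = {
    "AccountID": "1234567",
    "AuthKey": {"EnvironmentVar": "MAXMIND_AUTH"},
    "DBFolder": "~/.msticpy",
}

# Placeholder value so the demo is self-contained; use your real key outside the notebook.
os.environ["MAXMIND_AUTH"] = "example-key"

resolved = {key: resolve_setting(val) for key, val in geoip_args.items()}
print(resolved)
```

Keeping the key in an environment variable means the secret itself never has to be written into msticpyconfig.yaml.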
Extract nested dictionaries from pandas column to multiple rows/columns
@pioneerHitesh has added this as a new method in the mp_pivot pandas extension:
data_df.mp_pivot.dict_to_dataframe(col="my_nested_column")
It returns a dataframe with the column recursively expanded:
- lists become new rows
- dictionaries become new columns
So a column with the following structure:
| | NCol |
|---|---|
| 0 | {'A': ['A1', 'A2', 'A3'], 'B': {'B1': 'B1-1', 'B2': 'B2-1'}} |
| 1 | {'A': ['A3', 'A4', 'A5'], 'B': {'B3': 'B3-1', 'B4': 'B4-1'}} |
my_df = src_df.mp_pivot.dict_to_dataframe(col="NCol")
my_df
Would be unpacked to:
| | A.0 | A.1 | A.2 | B.B1 | B.B2 | B.B3 | B.B4 |
|---|---|---|---|---|---|---|---|
| 0 | A1 | A2 | A3 | B1-1 | B2-1 | nan | nan |
| 1 | A3 | A4 | A5 | nan | nan | B3-1 | B4-1 |
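The dotted column names in the output above come from joining each level of nesting with a ".". The following plain-Python sketch reproduces that naming for the two example rows. It is not the msticpy implementation: flatten is a hypothetical helper, and missing values show up as None rather than nan because no pandas is involved.

```python
def flatten(value, prefix=""):
    """Recursively flatten nested dicts and lists into {dotted_name: leaf}."""
    if isinstance(value, dict):
        pairs = value.items()
    elif isinstance(value, list):
        pairs = enumerate(value)  # list positions become name parts: A.0, A.1, ...
    else:
        return {prefix: value}
    items = {}
    for key, child in pairs:
        name = f"{prefix}.{key}" if prefix else str(key)
        items.update(flatten(child, name))
    return items


rows = [
    {"A": ["A1", "A2", "A3"], "B": {"B1": "B1-1", "B2": "B2-1"}},
    {"A": ["A3", "A4", "A5"], "B": {"B3": "B3-1", "B4": "B4-1"}},
]

flat_rows = [flatten(row) for row in rows]
# The union of all keys gives the column set; keys absent from a row stay empty.
columns = sorted({col for row in flat_rows for col in row})
table = [[row.get(col) for col in columns] for row in flat_rows]

print(columns)
for line in table:
    print(line)
```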
What's Changed
- Authentication module unit test by @ianhelle in #800
- Use sessions config and GeoIP download failure by @ianhelle in #801
- Added Inbuilt function to extract nested JSON by @pioneerHitesh in #798
- Add max retry parameter to the execution prevent HTTP 429 by @vx3r in #802
New Contributors
- @pioneerHitesh made their first contribution in #798
- @vx3r made their first contribution in #802
Full Changelog: v2.13.1...v2.14.0
Hotfix for authentication error
We introduced a bug in azure_auth_core that caused Azure authentication to fail.
What's Changed
- Provider and lookup typing by @FlorianBracq in #795
- Fix for bug in azure_core_auth that fails authentication by @ianhelle in #799
Full Changelog: v2.13.0...v2.13.1
AI documentation assistant, BinaryEdge TI provider and other misc fixes
We've been quietly doing some work to introduce LLM/GPT/AI capabilities into msticpy.
@EileenG02 has helped us in that direction by building a document Q&A agent using Autogen.
You can try it out in a notebook using the following:
Load the magic extension
%load_ext msticpy.aiagents.mp_docs_rag_magic
Ask a question in a separate cell using the %%ask cell magic
%%ask
What are the three things that I need to connect to Azure Query Provider?
Awesome work @EileenG02!
There's also a new TI provider for BinaryEdge courtesy of @petebryan.
Alongside this there have been quite a few contributions to fix and improve things like:
- Splunk improvements (thanks @Tatsuya-hasegawa)
- Fixes for Sentinel provider get_alert_rules to use updated API (thanks @BWC-TomW)
- A massive amount of type annotation work and fixes to context/TI providers by @FlorianBracq
- Miscellaneous fixes to things like the Sentinel TI provider, an MSSentinel tidy-up to handle parameters more consistently,
and replacing the term CountryName with CountryOrRegionName in geolocation contexts.
The gory details of the PRs follow:
What's Changed
- Add extra tests and fixes to QueryProvider, DriverBase and (as)sync query handling by @FlorianBracq in #777
- Fix incorrect ref to ip_utils module in docs by @ianhelle in #779
- Fix some deprecation warnings by @FlorianBracq in #781
- Fixing np.NaN error and build warnings by @ianhelle in #785
- Removing data matching AV signatures by @ianhelle in #786
- Create codeql_updated.yml by @ianhelle in #787
- Update black requirement from <24.0.0,>=20.8b1 to >=20.8b1,<25.0.0 by @dependabot in #742
- Update docutils requirement from <0.20.0 to <0.22.0 by @dependabot in #768
- Add upload data styles to Splunk uploader by @Tatsuya-hasegawa in #776
- Added BinaryEdge provider by @petebryan in #780
- Update sentinel_analytics.py to update get_alert_rules to use new API version by @BWC-TomW in #789
- Fixing MSSentinel to obey parameters by @ianhelle in #791
- Add Autogen and RAG Agent to MSTICpy by @EileenG02 in #793
- Update TILookup and ContextLookup by @FlorianBracq in #794
- Fix sentinel TI provider by @ianhelle in #797
- Updating CountryName to CountryOrRegionName by @ianhelle in #796
New Contributors
- @BWC-TomW made their first contribution in #789
- @EileenG02 made their first contribution in #793
Full Changelog: v2.12.0...v2.13.0
Splunk and Sentinel Updates
Sentinel updates
WorkspaceConfig and Sentinel QueryProvider (azure_monitor_driver) have had a few updates:
- handle both old (Kqlmagic) and standard connection string formats in WorkspaceConfig
- removing a lot of legacy code from WorkspaceConfig
- Allow additional connection parameters to be used with MSSentinel QueryProvider for
authentication parameters (e.g. you can now supply authentication parameters like "client_id", "client_secret" to query_provider.connect())
- msticpyconfig.yaml now supports using an "MSSentinel" key in place of "AzureSentinel"
- Workspace entries in msticpyconfig.yaml support an Args subkey, where you can add authentication parameters - these will be supplied to the connect() method if not overridden on the command line. Like Args sections for other providers, the values here can be text or references to environment variables or Azure Key Vault secrets.
- Fix to MSSentinel API update_incident to add full properties
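The "supplied unless overridden" behaviour of the Args subkey can be pictured as a simple dictionary merge in which explicitly passed values win. This is only a sketch of the precedence rule, using a hypothetical effective_connect_args helper and made-up values; it is not the code used by the Sentinel driver.

```python
def effective_connect_args(config_args, **call_kwargs):
    """Combine Args from configuration with arguments passed to connect().

    Values passed explicitly take precedence over configured ones.
    Hypothetical helper for illustration only.
    """
    merged = dict(config_args)
    merged.update(call_kwargs)
    return merged


# Values that might come from the workspace entry's Args subkey (made up).
workspace_args = {"client_id": "id-from-config", "client_secret": "secret-from-config"}

# Only client_id is overridden at call time; client_secret comes from config.
args = effective_connect_args(workspace_args, client_id="id-from-call")
print(args)
```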
Splunk Updates
- Added jwt authentication token expiry check.
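For readers unfamiliar with how a JWT expiry check works in general: the token's middle segment is base64url-encoded JSON, and its "exp" claim holds the expiry time as a Unix timestamp. The sketch below shows that idea with the standard library only. The helper names are hypothetical, the token is an unsigned demo value, and this is not the Splunk driver's actual code.

```python
import base64
import json
import time


def jwt_expiry(token):
    """Return the 'exp' claim (Unix timestamp) from a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return float(payload["exp"])


def is_expired(token, now=None):
    """True if the token's expiry time is at or before 'now' (default: current time)."""
    now = time.time() if now is None else now
    return jwt_expiry(token) <= now


def _b64(obj):
    """Encode a dict as an unpadded base64url JSON segment (demo use only)."""
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# Unsigned demo token: header.payload.signature with an empty signature.
demo_token = ".".join([_b64({"alg": "none"}), _b64({"exp": 1_700_000_000}), ""])

print(is_expired(demo_token, now=1_700_000_001))
print(is_expired(demo_token, now=1_699_999_999))
```

Checking expiry before connecting lets a client report a clear message instead of a generic authentication failure.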
Other fixes
- Fix for vtlookup3.py - Fixed problematic way of using nest_asyncio - this was causing failures when run from a langchain agent.
- Fix for lookup/tilookup - If the progress parameter was not passed it would still try to cancel a non-existent progress task and cause an exception.
- QueryProviders - Fix split query time-ranges calculation - thanks to @pjain90 for spotting this.
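As background to the split query fix, this is a small sketch of how a query time range can be divided into consecutive, non-overlapping chunks so that no part of the range is dropped or counted twice. The split_time_range function is a hypothetical illustration, not the calculation used inside msticpy.

```python
from datetime import datetime, timedelta


def split_time_range(start, end, step):
    """Split [start, end) into consecutive chunks of at most 'step' length."""
    if step <= timedelta(0):
        raise ValueError("step must be a positive duration")
    ranges = []
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + step, end)  # last chunk may be shorter
        ranges.append((chunk_start, chunk_end))
        chunk_start = chunk_end  # next chunk starts exactly where this one ends
    return ranges


day_start = datetime(2024, 1, 1)
day_end = datetime(2024, 1, 2)
chunks = split_time_range(day_start, day_end, timedelta(hours=10))

for chunk_start, chunk_end in chunks:
    print(chunk_start, "->", chunk_end)
```

The final chunk here covers only four hours, which is the kind of boundary case a range calculation needs to get right.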
What's Changed
- Set up CI with 1ES Azure Pipelines by @ianhelle in #763
- Update ws_config to handle kqlmagic connection strings by @ianhelle in #767
- Fix split query time-ranges calculation by @ianhelle in #762
- Add support for ruff and u/p devcontainer by @ianhelle in #765
- Add jwt auth token expire check and modify some messages when connecting Splunk by @Tatsuya-hasegawa in #770
- WSConfig updates by @ianhelle in #771
- Pass true for props into _build_sent_data when calling update_incident by @kylelol in #774
- Changing cert thumbprint from Sha1 to Sha256 in Az Kusto driver by @ianhelle in #775
New Contributors
Full Changelog: v2.11.0...v2.12.0
Sentinel Split Query fix
This is a minor release mainly to add a warning for Kusto/Sentinel queries that return partial results.
A close friend of MSTICPy (thx @Cyb3r-Monk) had spotted that MSTICPy does not report partial results when doing split queries so it's possible to lose data from the query range silently.
Due to an unfortunate admin error, the fix for this was committed direct to main, so no PR for this is available. :-(
If you want the query to fail (throw an exception) rather than just warn you can supply a new parameter fail_if_partial.
This only affects the Sentinel query provider and works for standard as well as split queries.
NOTE: the documentation has a typo and calls this fail_on_commit - we'll fix that in the next release to support both fail_if_partial and fail_on_partial
Example
qry_prov.exec_query(query_string, fail_if_partial=True)
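The difference between the default behaviour (warn) and fail_if_partial=True (raise) can be sketched as follows. The check_partial function and PartialResultsError exception are hypothetical names used for illustration; the exception type and message that msticpy actually uses may differ.

```python
import warnings


class PartialResultsError(RuntimeError):
    """Hypothetical exception type used only in this sketch."""


def check_partial(is_partial, fail_if_partial=False):
    """Warn about partial results, or raise if fail_if_partial is set."""
    if not is_partial:
        return
    message = "Query returned partial results; some data may be missing."
    if fail_if_partial:
        raise PartialResultsError(message)
    warnings.warn(message, RuntimeWarning)


# Default: a warning is emitted and execution continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_partial(is_partial=True)
warned = len(caught) == 1

# With fail_if_partial=True the same condition raises instead.
try:
    check_partial(is_partial=True, fail_if_partial=True)
    raised = False
except PartialResultsError:
    raised = True

print(warned, raised)
```

Raising is usually the safer choice in automation, where nobody is watching for warnings.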
What's Changed
- Missing PR for partial query warning and fixes for pandas deprecation warnings. See the diff for changes
- Fixing group.apply for pandas < 2.2.1 by @ianhelle in #759
- Added missing quotation in code block by @ryan-aus in #753
- Bump httpx from 0.25.2 to 0.27.0 by @dependabot in #754
- Bump readthedocs-sphinx-ext from 2.2.3 to 2.2.5 by @dependabot in #743
- Updated conda reqs files for new packages by @ianhelle in #758
- Build break fix for splunk SDK by @ianhelle in #760
New Contributors
Full Changelog: v2.10.0...v2.11.0
v2.10.0
What's Changed
- Add nest_asyncio to run threaded queries by @FlorianBracq in #737
- Bump sphinx-rtd-theme from 1.3.0 to 2.0.0 by @dependabot in #738
- Bump httpx from 0.25.0 to 0.25.2 by @dependabot in #736
- Adding Virus Total Search Capabilities by @secops-account in #739
- Add security token auth and credential loading from msticpyconfig.yaml to SplunkUploader by @Tatsuya-hasegawa in #731
- fix: updated _get_query_status in the azure monitor driver by @aka0 in #745
- Added M365DGraph to the supported environments for existing queries by @d3vzer0 in #748
- Small Typo correction in SentinelWatchlists.rst by @Korving-F in #746
- Fix ibm_xforce TI provider for domain names and URLs by @pcoccoli in #749
- Update python-package.yml by @ianhelle in #750
- Ianhelle/aml updates 2024 01 31 by @ianhelle in #751
- Ianhelle/warning fixes 2024 02 11 by @ianhelle in #752
New Contributors
- @secops-account made their first contribution in #739
- @aka0 made their first contribution in #745
- @Korving-F made their first contribution in #746
- @pcoccoli made their first contribution in #749
Full Changelog: v2.9.0...v2.10.0
Defender Advanced hunting, IPQualityScore TI provider
Some of the highlights of this release:
IPQualityScore
New TI provider submitted by @petebryan - provides a lot of interesting stats on IPs.
Defender Advanced Hunting API
Thanks to @d3vzer0 our MS Defender client is now able to use the supported Graph-based API rather than the legacy
APIs. To use this, for the moment use the DataEnvironment name M365DGraph when you create a
query provider. In the next 0.x release we will switch the other aliases for M365D, MDE, MDATP to use this
new interface and deprecate the existing ones.
Startup errors when running in unexpected environments.
init_notebook made some (incorrect) assumptions about when it would be running in a Synapse environment.
Azure Machine Learning have recently changed their default compute to be a Synapse environment.
Fixes here will correct failures due to faulty detection of environment type.
Startup fixes and perf improvements
We've optimized some of the imports done within the package at startup so msticpy should be quicker to
load.
Azure env credentials fix
Although we previously supported the Azure EnvironmentCredential credential type, our implementation allowed
you to use it only with ClientID + ClientSecret. The changes allow it to be used with the other supported
credential formats - notably username + password and certificate authentication using a certificate file.
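As a rough guide to the three formats, the sketch below inspects a set of environment variables and reports which one is configured. The variable names are the ones documented for the azure-identity EnvironmentCredential; the detect_env_credential helper and the order in which it checks are assumptions made for this illustration and are not msticpy code.

```python
def detect_env_credential(env):
    """Return which EnvironmentCredential format the given variables describe.

    'env' is any mapping of variable names to values (e.g. os.environ).
    Hypothetical helper for illustration only.
    """

    def has(*names):
        return all(env.get(name) for name in names)

    if has("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET"):
        return "client_secret"
    if has("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_CERTIFICATE_PATH"):
        return "certificate"
    if has("AZURE_CLIENT_ID", "AZURE_USERNAME", "AZURE_PASSWORD"):
        return "username_password"
    return None


# Made-up values; pass os.environ instead to check a real environment.
cert_env = {
    "AZURE_TENANT_ID": "tenant",
    "AZURE_CLIENT_ID": "client",
    "AZURE_CLIENT_CERTIFICATE_PATH": "/path/to/cert.pem",
}
user_env = {
    "AZURE_CLIENT_ID": "client",
    "AZURE_USERNAME": "user@example.com",
    "AZURE_PASSWORD": "placeholder",
}

print(detect_env_credential(cert_env))
print(detect_env_credential(user_env))
```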
Improvements to Entities
Although these are not visible to most people, we try to keep our Entity definitions in sync with the official
Microsoft "V3" entity definitions. We've added a few entity types and updated some of the attributes
to bring this in line, while still allowing backwards compatible attributes to be used.
What's Changed
- Ianhelle/entity updates 2023 09 01 by @ianhelle in #718
- Ianhelle/lazy-import-init-2023-09-26 by @ianhelle in #717
- Fix Azure env credential authentication by @ianhelle in #722
- Update documentation for installing in isolated env by @ccianelli22 in #724
- Bump isort to 5.12.0 in pre-commit config by @2xyo in #723
- Remove stack trace from logging by @FlorianBracq in #729
- fix: init_notebook and entities by @ianhelle in #730
- Fix time span values by @FlorianBracq in #728
- Added additional DataProvider for Advanced Hunting via Graph by @d3vzer0 in #725
- Allow POST HTTP method by @2xyo in #726
- Bump readthedocs-sphinx-ext from 2.2.2 to 2.2.3 by @dependabot in #716
- Added new TI Provider - IPQualityScore by @petebryan in #733
New Contributors
Full Changelog: v2.8.0...v2.9.0
Stability release
A few bugs had crept in over the last couple of releases: some due to buggy coding, some due to the world moving forward. So, many items in this release are to address these.
Among the feature improvements are the following:
- Documentation and scripts from @ccianelli22 for creating a MSTICPy install for use in isolated (no Internet) environments. This is super useful for customers operating in sovereign clouds or other air-gapped high-security environments.
- Added Splunk authentication method using security token rather than username/password - thanks @Tatsuya-hasegawa
- Query yaml file validation by @FlorianBracq
- Paging for large CyberReason queries by @FlorianBracq
- Modern method to obtain cloud-specific URL endpoints for Azure services. Previously, we were relying on msrestazure, which is now deprecated for this purpose. Many thanks to @ccianelli22 for the work to do this.
- Fix (by me) for a bug I'd introduced with the switch to using Azure-monitor-query library for MS Sentinel. When using a connection string with this new driver, the logic failed to parse and extract details from this correctly. Many thanks to @cindraw for reporting this bug.
What's Changed
- Update mde_proc_pub.pkl by @FlorianBracq in #709
- Update Introduction.rst by @praveenjutur in #700
- Update methodology of getting endpoints for cloud environment by @ccianelli22 in #704
- Validation of the YAML structure of query files by @FlorianBracq in #660
- Intsights api update by @FlorianBracq in #710
- Fix m365d/mde hunting query options by @Tatsuya-hasegawa in #702
- Cybereason pagination support + multi-threading by @FlorianBracq in #707
- Add bearer token auth to splunk driver by @Tatsuya-hasegawa in #708
- fix wl bug when creating a new wl when wl count is 0 by @ccianelli22 in #719
- Update installation docs to include installation for isolated envs by @ccianelli22 in #715
- Fixing regular expression error for connection string in WorkspaceConfig by @ianhelle in #706
- Fix documentation formatting, update steps for downloading msticpy by @ccianelli22 in #720
New Contributors
- @praveenjutur made their first contribution in #700
- @ccianelli22 made their first contribution in #704
Full Changelog: v2.7.0...v2.8.0
2.8.0 pre-release
Updated method to dynamically fetch Azure endpoints (rather than relying on deprecated msrestazure).
Updated version of Insight data provider
TI Providers, Sentinel/Kusto Drivers, Query Editor
Main Changes in this release
Two new TI Providers
Two cool new providers to add to the growing family in MSTICPy:
- CrowdSec is a commercial Malicious IP threat service
with a free tier for limited threat lookups. (big thanks to @sbs2001 for submitting this)
- AbuseIPDB is an open/free provider of threat intel
on malicious IP addresses, providing a central abuse list to look up IP addresses that have
been associated with malicious activity. (big thanks to @rrevuelta for submitting this.)
As with other providers, these are automatically enabled for use if you include settings
for the API keys in your msticpyconfig.yaml
Updated Data providers for Sentinel/Azure Monitor/Log Analytics and Kusto/Azure Data Explorer
In v2.5.0 we introduced replacement drivers for the MS Sentinel/LogAnalytics/Azure Monitor
and Kusto/Azure Data Explorer providers.
The new drivers are based on the Azure SDKs for each data service. You can read the release notes
for them here.
The new drivers give several advantages, like being able to run queries across multiple workspaces
or Kusto clusters in parallel. Splitting large queries by time chunks (split_query_by parameter)
will also run multiple segments in parallel, dramatically speeding up the query. The default
parallelism is 4 simultaneous threads but you can change this (although be wary of the impact
on the data service for highly parallel queries - this may affect other users and services accessing
the data).
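The general pattern of running query segments on a small pool of threads looks something like the sketch below. It uses the standard library's ThreadPoolExecutor with four workers to mirror the default mentioned above; run_segment is a stand-in for a real query call and the whole block is an illustration of the pattern, not the driver's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def run_segment(segment):
    """Stand-in for a real query call; returns a label for the segment."""
    start, end = segment
    return f"rows for {start}-{end}"


# Example segments, e.g. hour ranges covering one day.
segments = [(0, 6), (6, 12), (12, 18), (18, 24)]

# Four workers at most, so no more than four segments run at the same time.
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in the same order as the input segments.
    results = list(pool.map(run_segment, segments))

print(results)
```

Keeping the worker count small is what limits the load on the data service; raising it trades faster results for more simultaneous queries.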
The new drivers are now the default drivers for these providers. They are used by default for
the "MSSentinel" and "Kusto" data environment identifiers. For backward compatibility, they will
also continue to support the "MSSentinel_New" and "Kusto_New" identifiers.
To invoke the previous Kqlmagic-based drivers use "MSSentinel_Legacy" or "Kusto_Legacy".
This change also brings a dependency change for MSTICPy. The following packages are now
part of the core installed dependencies:
- azure-kusto-data
- azure-monitor-query
Kqlmagic and its dependencies are no longer installed by default but can be installed with the "kql" extra:
python -m pip install msticpy[kql]
See these links to read more about the MSSentinel provider and Kusto providers.
Query Editor
We've added an ipywidgets-based query template editor.
note: this is somewhat provisional so please be sure to test and report bugs.
The query editor allows you to edit existing query files or create new ones and helps manage
the various query properties (like parameter definitions) and query metadata.
Check out the documentation on how to use this in the Extending section of the MSTICPy documentation.
Updates to Authentication.
The improvements here mainly affect the AzureData and MicrosoftSentinel classes but
also bring some improvements to the core authentication - such as being able to specify
the Azure cloud from the az_connect function and authenticate by providing an AzureCredential.
- You can now authenticate by supplying an AzureCredential as a credential parameter
for AzureData and MicrosoftSentinel connect methods.
- The connect methods for both these classes also support a cloud parameter to specify different sovereign clouds
- The __init__ and connect methods are instrumented with logging to help debug issues:
import msticpy as mp
from msticpy.context.azure.sentinel_core import MicrosoftSentinel
mp.set_logging_level("INFO")
mssentinel = MicrosoftSentinel()
mssentinel.connect()
Other major items
- MS Sentinel delete watchlist API added by @mbabinski
- Splunk fixes added by @Tatsuya-hasegawa
Thanks
Our thanks to the following folks who contributed to this release.
@FlorianBracq
@sbs2001
@rrevuelta
@mbabinski
@Tatsuya-hasegawa
What's Changed
- Add CrowdSec TIProvider by @sbs2001 in #673
- Added delete_watchlist_item method by @mbabinski in #682
- Update pandas requirement from <2.0.0,>=1.4.0 to >=1.4.0,<3.0.0 by @dependabot in #653
- Bump sphinx from 6.1.3 to 7.1.0 by @dependabot in #686
- Add AbuseIPDB TIProvider by @rrevuelta in #687
- Typo corrections in queries by @ianhelle in #684
- Ianhelle/query editor 2023 04 21 by @ianhelle in #685
- Few fix splunk driver by @Tatsuya-hasegawa in #688
- Ianhelle/mssentinel auth 2023 08 01 by @ianhelle in #690
- Updating timeline docs to prioritize pd accessors by @ianhelle in #691
- Fix splunk uploader create index option by @Tatsuya-hasegawa in #692
- v2.7.0 - changing new kql/sentinel drivers to be defaults by @ianhelle in #696
New Contributors
- @sbs2001 made their first contribution in #673
- @mbabinski made their first contribution in #682
Full Changelog: v2.6.0...v2.7.0