MSTICPy Version 2.0 #466
ianhelle announced in Announcements
MSTICPy Release 2.0
A notebook containing some of the features of MSTICPy 2.0
is available at What's new in MSTICPy 2.0
If you are new to MSTICPy or just want to catch up and get a quick
overview check out our new MSTICPy Quickstart Guide.
Dropping Python 3.6 support
As of this release we only officially support Python 3.8 and above.
We will try to support Python 3.6 if the fixes required are small
and contained but make no guarantees of it working completely on
Python prior to 3.8.
Package re-organization and module search
One of our main goals for V2.0.0 was to re-organize MSTICPy to be more logical and easier to
use and maintain. Several years of organic growth had seen modules created in places that
seemed like a good idea at the time but did not age well.
The discussion about the V2 structure can be found here #320.
Due to the re-organization, many features are no longer in places
where they used to be imported from!
We have tried to maintain compatibility with old locations by adding "glue" modules.
These allow import of many modules from their previous locations but will issue a
Deprecation warning if loaded from the old location.
The warning will contain the new location of the module -
so you should update your code to point to this new location.
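The general idea of a "glue" module can be sketched with the standard library alone. This is not MSTICPy's actual implementation: the helper name import_moved and the module names used are hypothetical stand-ins, chosen only to show a redirect that warns and then loads the module from its new location.

```python
import importlib
import warnings


def import_moved(old_name: str, new_name: str):
    """Import `new_name`, warning that `old_name` is a deprecated location."""
    warnings.warn(
        f"'{old_name}' has moved to '{new_name}'. "
        "Please update your import to the new location.",
        category=DeprecationWarning,
        stacklevel=2,
    )
    return importlib.import_module(new_name)


# Stand-in names: pretend the stdlib 'json' module used to live
# at the (hypothetical) location 'old_pkg.json_tools'.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    module = import_moved("old_pkg.json_tools", "json")

print(caught[0].message)
```

Code that still uses the old location keeps working, while the warning text tells you which import to change.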
This table gives a quick overview of the V2.0 structure
Notable things that have moved:
- Most modules from the sectools folder have migrated to context, transform or analysis
- Most modules from the nbtools folder have migrated to:
  - msticpy.init (not to be confused with __init__) - package initialization
  - msticpy.vis - visualization modules
- Pivot functionality has moved to msticpy.init
Module Search
If you are having trouble finding a module, we have added a simple search function:
Matches will be returned in a table with links to the module documentation,
for example under the title Modules matching 'riskiq'.
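A minimal sketch of calling the search function follows. It assumes the function is exposed as search on the root msticpy package; the wrapper name find_msticpy_modules is hypothetical, and the import is deferred so that the snippet only needs MSTICPy when the function is actually called.

```python
def find_msticpy_modules(term: str):
    """Search MSTICPy modules for `term` and show the matches."""
    import msticpy as mp  # imported lazily; requires MSTICPy 2.0 or later

    # Assumption: the search function is available as mp.search.
    return mp.search(term)


# In a notebook with MSTICPy installed:
# find_msticpy_modules("riskiq")
```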
Simplifying imports in MSTICPy
The root module in MSTICPy now has several modules and
classes that can be directly accessed from it (rather than
having to import them individually).
We've also decided to adopt a new "house style" of importing
msticpy as the alias mp. Slavishly copying the idea from
some of the admired packages that we use (pandas -> pd, numpy -> np,
networkx -> nx) we thought it would save
a bit of typing. You are free to adopt or ignore this style -
it obviously has no impact on the functionality.
Many commonly-used classes and functions are exposed as
attributes of msticpy (or mp).
Also a number of commonly-used classes are imported by default
by init_notebook, notably all of the entity classes.
This makes it easier to use pivot functions without any initialization
or import steps.
init_notebook improvements
- You no longer need to supply the namespace=globals() parameter when
  calling from a notebook. init_notebook will automatically obtain the
  notebook global namespace and populate imports into it.
- The default verbosity of init_notebook is now 0, which produces
  minimal output - use verbosity=1 or verbosity=2 to get more
  detailed reporting.
- The Pivot subsystem is automatically initialized in init_notebook.
- MSTICPy pandas accessors are automatically initialized in init_notebook
  (those that require optional packages, such as the timeseries accessors, are
  not initialized by default).
- init_notebook supports a config parameter - you can use this to
  provide a custom path to a msticpyconfig.yaml overriding the usual
  defaults.
- Searching for a config.json file is only enabled if you are running
  MSTICPy in Azure Machine Learning.
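The options listed above could be combined roughly as follows. The helper names init_kwargs and start_notebook and the example path are hypothetical. Only the verbosity and config parameters named in the list are used.

```python
def init_kwargs(config_path=None, verbosity=0):
    """Build keyword arguments for init_notebook from the options above."""
    kwargs = {"verbosity": verbosity}
    if config_path is not None:
        # Custom msticpyconfig.yaml path, overriding the usual defaults.
        kwargs["config"] = config_path
    return kwargs


def start_notebook(config_path=None, verbosity=0):
    """Initialize MSTICPy in a notebook (requires MSTICPy 2.0 or later)."""
    import msticpy as mp  # the "house style" alias

    # No namespace=globals() argument is passed here.
    mp.init_notebook(**init_kwargs(config_path, verbosity))


# Example: more detailed reporting with a custom (hypothetical) config path.
# start_notebook(config_path="./msticpyconfig.yaml", verbosity=1)
```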
Folium map update - single function, layers, custom icons
The Folium module in MSTICPy has always been a bit complex to use
since it normally required that you convert IP addresses to MSTICPy
IpAddress entities before adding them to the map.
You can now
plot maps with a single function call from a DataFrame containing
IP addresses or location coordinates. You can group the data
into folium layers, specify columns to populate popups and tooltips
and to customize the icons and coloring.
plot_map
A new plot_map function (in the msticpy.vis.foliummap module)
lets you plot mapping points directly from a DataFrame. You can
specify either an ip_column or coordinate columns (lat_column
and long_column). In the former case, the geolocation of the IP address
is looked up using the MaxMind GeoLiteLookup data.
You can also control the icons used for each marker with the
icon_column parameter. If you happen to have a column in your
data that contains names of FontAwesome or Glyphicons icons
you can use that column directly.
More typically, you would combine the icon_column with the
icon_map parameter. You can specify either a dictionary or a
function. For a dictionary, the value of the row in icon_column
is used as a key - the value is a dictionary of icon parameters
passed to the folium.Icon class. For a function, the icon_column
value is passed to the function as a single parameter and the return value
should be a dictionary of valid parameters for the Icon class.
You can read the documentation for this function in the docs.
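As a rough illustration of the two icon_map forms, the sketch below defines a dictionary and an equivalent function. The column names SourceIP and Severity, the severity values and the helper names are hypothetical. The color, icon and prefix keys are standard folium.Icon parameters. The plot_map call sits inside a function so that folium and MSTICPy are only needed when it is called.

```python
# Dictionary form: keys are values found in the icon column,
# values are keyword arguments for folium.Icon.
ICON_MAP = {
    "high": {"color": "red", "icon": "warning", "prefix": "fa"},
    "low": {"color": "green", "icon": "info-sign", "prefix": "glyphicon"},
}


def icon_for_severity(value):
    """Function form: takes one icon-column value, returns folium.Icon kwargs."""
    return ICON_MAP.get(value, {"color": "blue", "icon": "info-sign"})


def map_alerts(alert_df):
    """Plot one marker per row of a DataFrame of alerts."""
    from msticpy.vis.foliummap import plot_map  # needs MSTICPy 2.0 and folium

    return plot_map(
        data=alert_df,
        ip_column="SourceIP",        # hypothetical column of IP addresses
        icon_column="Severity",      # hypothetical column of severity labels
        icon_map=icon_for_severity,  # or icon_map=ICON_MAP
    )
```

The function form is convenient when you want a fallback icon for values that are missing from the dictionary.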
plot_map pandas accessor
Plot maps from the comfort of your own DataFrame!
Using the msticpy mp_plot accessor you can plot maps directly
from a DataFrame containing IP or location information.
The folium_map function has the same syntax as plot_map
except that you omit the data parameter.
Layering, Tooltips and Clustering support
In plot_map and .mp_plot.folium_map you can specify
a layer_column parameter. This will group the data
by the values in that column and create an
individually selectable/displayable layer in Folium for each group. For performance
and sanity reasons this should be a column with a relatively
small number of discrete values.
Clustering of markers in the same layer is also implemented by
default - this will collapse multiple closely located markers
into a cluster that you can expand by clicking or zooming.
You can also populate tooltips and popups with values
from one or more column names.
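A small sketch of the accessor form with layering follows. The helper name map_by_layer is hypothetical and the column names are supplied by the caller. It assumes the mp_plot accessor has already been registered (for example by init_notebook) and uses only the ip_column and layer_column parameters described above.

```python
def map_by_layer(ip_df, ip_column, layer_column):
    """Plot a DataFrame on a Folium map, one selectable layer per group."""
    # Same arguments as plot_map, except that `data` is omitted because
    # the DataFrame itself is the data.
    return ip_df.mp_plot.folium_map(
        ip_column=ip_column,
        layer_column=layer_column,
    )


# Example with hypothetical column names:
# map_by_layer(signins_df, ip_column="IpAddress", layer_column="Country")
```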
"Classic" interface
The original FoliumMap class is still there for more manual
control. This has also been
enhanced to support direct plotting from IP, coordinates or GeoHash
in addition to the existing IpAddress and GeoLocation entities.
It also supports layering and clustering.
Threat Intelligence providers - async support
When you have configured more than one TI provider, MSTICPy will
execute requests to each of them asynchronously. This will bring big
performance benefits when querying IoCs from multiple providers.
Note: requests to individual providers are still executed synchronously
since we want to avoid swamping provider services with multiple
simultaneous requests.
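The execution model described above (providers run concurrently, while requests to any one provider stay sequential) can be sketched with asyncio. This is an illustration of the pattern only, not MSTICPy's code: the provider names are placeholders and asyncio.sleep stands in for a network request.

```python
import asyncio


async def query_provider(name, iocs, delay=0.01):
    """Look up IoCs one at a time for a single (simulated) provider."""
    results = []
    for ioc in iocs:
        await asyncio.sleep(delay)  # stands in for one HTTP request
        results.append((name, ioc, "no match"))
    return results


async def lookup_all(providers, iocs):
    """Run every provider concurrently; each provider stays sequential."""
    per_provider = await asyncio.gather(
        *(query_provider(name, iocs) for name in providers)
    )
    # Flatten the per-provider lists into a single list of result rows.
    return [row for rows in per_provider for row in rows]


rows = asyncio.run(
    lookup_all(["ProviderA", "ProviderB"], ["1.2.3.4", "evil.example"])
)
print(len(rows))  # 4 rows: 2 providers x 2 IoCs
```

With this arrangement the total time is roughly that of the slowest provider rather than the sum of all providers, and no single service receives more than one request at a time.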
We've also implemented progress bar tracking for TILookups, giving a visual
indication of progress when querying multiple IoCs.
Combining the progress tracking with asynchronous operation means
that not only is performing lookups for lots of observables faster
but you are also less likely to be left guessing whether or not your kernel
has hung.
Note that asynchronous execution only works with lookup_iocs
and TI lookups done via the pivot functions.
lookup_ioc will run queries to multiple providers in sequence
so will usually be a lot slower than lookup_iocs.
TI Providers are now also loaded on demand - i.e. only when you have
a configuration entry in your msticpyconfig.yaml for that provider.
This prevents loading of code (and possibly import errors) due to providers
which you are not intending to use.
Finally, we've added functions to enable and disable providers
after loading TILookup:
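A sketch of how these functions might be used follows. The method names enable_provider and disable_provider are an assumption based on the MSTICPy TILookup documentation, since the original snippet is not reproduced here. The helper name apply_provider_state is hypothetical.

```python
def apply_provider_state(ti_lookup, wanted):
    """Enable or disable each provider named in `wanted` (name -> bool)."""
    for name, enabled in wanted.items():
        if enabled:
            ti_lookup.enable_provider(name)   # assumed TILookup method
        else:
            ti_lookup.disable_provider(name)  # assumed TILookup method


# With MSTICPy installed and providers configured:
# ti_lookup = mp.TILookup()
# apply_provider_state(ti_lookup, {"VirusTotal": True, "OTX": False})
```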
Time Series simplified - analysis and plotting
Although the Time Series functionality was relatively simple to
use, it previously required several disconnected steps to compute
the time series, plot the data, extract the anomaly periods. Each of
these needed a separate function import. Now you can do all of these
from a DataFrame via pandas accessors.
(currently there is a separate accessor df.mp_timeseries
but we are still working on consolidating our pandas accessors
so this may change before the final release.)
Because you typically still need these separate outputs, the accessor
has multiple methods:
- df.mp_timeseries.analyze - takes a time-summarized DataFrame
  and returns the results of a time-series decomposition
- df.mp_timeseries.plot - takes a decomposed time-series and
  plots the anomalies
- df.mp_timeseries.anomaly_periods - extracts anomaly periods
  as a list of time ranges
- df.mp_timeseries.kql_periods - extracts anomaly periods
  as a list of KQL query clauses
- df.mp_timeseries.apply_threshold - applies a new anomaly
  threshold score and returns the results.
See documentation
Analyze data to produce time series.
Analyze and plot time series anomalies
Analyze and retrieve anomaly time ranges
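The three steps above can be chained roughly as follows. The helper names are hypothetical, and keyword arguments are passed straight through because the exact parameters of analyze and plot are not listed here. The sketch assumes the mp_timeseries accessor has been registered, which needs the optional time series packages.

```python
def analyze_series(ts_df, **analyze_kwargs):
    """Decompose a time-summarized DataFrame into a time series result."""
    return ts_df.mp_timeseries.analyze(**analyze_kwargs)


def plot_anomalies(ts_df, **plot_kwargs):
    """Analyze the data, then plot the anomalies of the decomposed series."""
    return ts_df.mp_timeseries.analyze().mp_timeseries.plot(**plot_kwargs)


def anomaly_ranges(ts_df, **analyze_kwargs):
    """Analyze the data, then return anomaly periods as time ranges."""
    decomposed = ts_df.mp_timeseries.analyze(**analyze_kwargs)
    return decomposed.mp_timeseries.anomaly_periods()
```

Because analyze returns a DataFrame, the plotting and extraction methods can be called on its result through the same accessor.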
DataFrame to graph/network visualization
You can convert a pandas DataFrame into a NetworkX graph or
plot directly as a graph using Bokeh interactive plotting.
You pass the functions the column names for the source and target nodes to build a basic graph. You can also name other columns to be node or edge attributes. When displayed these attributes are visible as popup details courtesy of Bokeh’s Hover tool.
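A sketch of plotting a graph directly from a DataFrame follows. The accessor method name network and the parameter names source_col, target_col, source_attrs, target_attrs and edge_attrs are assumptions based on the MSTICPy network plotting documentation. The column names and the helper name plot_process_graph are hypothetical.

```python
def plot_process_graph(proc_df):
    """Plot accounts (source nodes) linked to processes (target nodes)."""
    return proc_df.mp_plot.network(
        source_col="Account",            # hypothetical source node column
        target_col="Process",            # hypothetical target node column
        source_attrs=["LogonId"],        # shown when hovering a source node
        target_attrs=["CommandLine"],    # shown when hovering a target node
        edge_attrs=["TimeGenerated"],    # attached to each edge
    )
```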
Pivots - easy initialization/dynamic data pivots
The pivot functionality has been overhauled. It is now initialized
automatically in init_notebook - so you don't have to import
and create an instance of Pivot.
Better data provider support
Previously, queries from
data providers were added at initialization of the Pivot subsystem.
This meant that you had to re-initialize Pivot.
Data providers now dynamically add relevant queries as pivot
functions when you authenticate.
Multi-instance provider support
Some query providers (such as MS Sentinel) support multiple instances.
Previously this was not well supported in Pivot functions - the last
provider loaded would overwrite the queries from earlier providers. Pivot now
supports separate instance naming so that each Workspace has a
separate instance of a given pivot query.
Threat Intelligence pivot functions
The naming of the Threat Intelligence pivot functions has been
simplified considerably.
VirusTotal and RiskIQ relationship queries should now be available as
pivot functions (you need the VT 3 and PassiveTotal packages installed
respectively to enable this functionality).
More Defender query pivots
A number of MS Defender queries (using either the MDE or MSSentinel
QueryProviders) are exposed as Pivot functions.
Consolidating Pandas accessors
Pandas accessors let you extend a pandas DataFrame or Series with
custom functions. We use these in MSTICPy to let you call analysis or
visualization functions as methods of a DataFrame.
Most of the functions previously exposed as pandas accessors, plus
some new ones, have been consolidated into two main accessors.
mp accessor
mp_plot accessor
Example usage (note: the required parameters, if any, are not shown)
One of the benefits of using accessors is the ability to
chain them into a single pandas expression (mixing
with other pandas methods).
MS Sentinel workspace configuration
From MPConfigEdit you can more easily import and update
your Sentinel workspace configuration.
Resolve Settings
If you have a minimal configuration (e.g. just the Workspace ID and Tenant ID)
you can retrieve other values such as Subscription ID, Workspace Name
and Resource Group and save them to your configuration using the Resolve
Settings button
Import Settings from URL
You can copy the URL from the Sentinel portal and paste it into
the MPConfigEdit interface. It will extract and look up the full
details of the workspace to save to your settings.
Expanded Sentinel API support
The functions used to implement the above functionality are
also available standalone in the MSSentinel class.
MS Defender queries available in MS Sentinel QueryProvider
Since Sentinel now has the ability to import Microsoft Defender data, we've
made the Defender queries usable from the MS Sentinel provider.
This is a more general functionality that allows us to share
compatible queries between different QueryProviders.
Many of the MS Defender queries are also now available as Pivot functions.
Microsoft Sentinel QueryProvider
The MS Sentinel provider now supports a timeout parameter, allowing you
to lengthen or shorten the default.
You can set other options supported by Kqlmagic when initializing
the provider.
You can specify a workspace name as a parameter when connecting
instead of creating a WorkspaceConfig instance or supplying
a connection string. To use the Default workspace supply "Default"
as the workspace name.
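A sketch combining the timeout and workspace options follows. It assumes timeout is passed when the provider is created and workspace when connecting. The helper name connect_sentinel, the provider_cls argument (there only so the function can be exercised without MSTICPy) and the value 300 are hypothetical.

```python
def connect_sentinel(workspace="Default", timeout=300, provider_cls=None):
    """Create an MS Sentinel query provider and connect by workspace name."""
    if provider_cls is None:
        from msticpy.data import QueryProvider  # requires MSTICPy

        provider_cls = QueryProvider

    # Assumption: timeout is accepted when the provider is initialized.
    qry_prov = provider_cls("MSSentinel", timeout=timeout)
    # No WorkspaceConfig instance or connection string is needed.
    qry_prov.connect(workspace=workspace)
    return qry_prov


# qry_prov = connect_sentinel()                # the Default workspace
# qry_prov = connect_sentinel("MyWorkspace")   # a named workspace
```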
New queries
Several new Sentinel and MS Defender queries have been added.
Over 30 MS Defender queries can now also be used in MS Sentinel workspaces if
MS Defender for Endpoint/MS Defender 365 data is connected to Sentinel
Additional Azure resource graph queries
See the updated built-in query list
Documentation Additions and Improvements
The documentation for V2.0 is available at https://msticpy.readthedocs.io
(Previous versions are still online and can be accessed through
the ReadTheDocs interface).
New and updated documents
API documentation
As well as including all of the new APIs, the API documentation has
been split into a module-per-page to make it easier to read and navigate.
InterSphinx
The API docs also now support "InterSphinx".
This means that MSTICPy references to objects in other packages (e.g. Python
standard library, pandas, Bokeh) have active links that will take you
to the native documentation for that item.
Sample notebooks
The sample notebooks for most of these features have been updated
along the same lines. See MSTICPy Sample notebooks
There are three new notebooks:
ContiLeaks notebook added to MSTICPy Repo
We are privileged to host Thomas's awesome ContiLeaks notebook that
covers investigation into attacker forum chats including
some very cool illustration of using natural language translation
in a notebook.
Thanks @fr0gger!
Miscellaneous improvements
- ... can identify or track requests from MSTICPy/Notebooks.
- ... to download data files at initialization - only on first use.
- ... that you initialize it with are the same
- ... especially deprecation warnings.
Feedback
Please reach out to us on GitHub - file an issue or start a discussion on
https://github.com/microsoft/msticpy - or [email protected]
Previous feature changes since MSTICPy 1.0
This discussion was created from the release MSTICPy Version 2.0.