The OCI Data Science service uses conda environments to manage the libraries that are available to a notebook. The service provides a number of conda environments designed to give you best-in-class libraries for common data science tasks. These environments are grouped into families, and a family can consist of one or more conda environments. Each family of conda environments has notebooks that demonstrate how to perform different data science tasks. This section is organized around these conda environment families and provides the notebooks that you need to get started quickly.
- data_exploration_and_manipulation: The Data Exploration and Manipulation conda environment family gives you the tools that you need to perform exploratory data analysis and develop a deeper understanding of the data that you are working with.
- natural_language_processing: The Natural Language Processing conda environment family provides the libraries to perform cutting-edge NLP tasks.
- onnx: ONNX is a standard format to represent machine learning models. It is also the preferred format for storing models in the OCI Data Science service. The ONNX conda environment family enables you to work with ONNX files.
- oracle_database: The Oracle Database conda environment family is focused on the tools that are needed to interact with databases in general. However, there is an emphasis on using Oracle Autonomous Database.
- pypgx: PyPGX is a graph toolkit based on the Parallel Graph AnalytiX (PGX) libraries. The PyPGX conda environment family provides a graph query language (PGQL), optimized analytics algorithms, and graph machine learning tools.
- pyspark: The PySpark conda environment family allows you to create and run PySpark operations within the notebook session. It is also a great way to create and debug PySpark applications before submitting them to the OCI Data Flow service, which is OCI's Spark service.
- pytorch: PyTorch is a machine learning library that is used in applications such as NLP, computer vision, and much more. The PyTorch conda environment family supports CPU and GPU versions.
- rapids: RAPIDS is a GPU-only library that is designed for data science workflows. The RAPIDS conda environment family enables you to make the most of OCI's NVIDIA GPU-based computing.
- tensorflow: TensorFlow is a machine learning platform that is focused on deep neural networks. The TensorFlow conda environment family has support for CPU and GPU.
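
Because each conda environment determines which libraries a notebook can import, it can help to check what the active kernel provides before you open a family's notebooks. The following is a minimal sketch that uses only the Python standard library. The `FAMILY_MODULES` mapping and the `available_families` helper are hypothetical names introduced for illustration, and the module names in the mapping are assumptions about a representative package for each family, not an official list from the service.

```python
import importlib.util

# Assumed mapping from a family name (as listed above) to one representative
# importable module. These module names are illustrative assumptions.
FAMILY_MODULES = {
    "onnx": "onnx",
    "pyspark": "pyspark",
    "pytorch": "torch",
    "rapids": "cudf",
    "tensorflow": "tensorflow",
}


def available_families(family_modules=FAMILY_MODULES):
    """Return a dict mapping each family name to True if its module can be found.

    The check only locates the module; it does not import it, so it stays
    fast even for large libraries.
    """
    result = {}
    for family, module in family_modules.items():
        try:
            found = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or unusual module states;
            # treat those cases as "not available".
            found = False
        result[family] = found
    return result


if __name__ == "__main__":
    for family, ok in sorted(available_families().items()):
        status = "available" if ok else "not installed"
        print(f"{family:12s} {status}")
```

Run it in a notebook cell after you select a kernel. A family reported as not installed usually means that a different conda environment is active.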