Here’s a curated list of the tools and resources that support my tech journey. From development to testing and database management, these are the essentials I rely on to get the job done efficiently. 🙌 Dive in and explore the tools that enhance my workflow.
- anaconda-extension-pack: Set of extensions that enhance the experience of Anaconda customers using Visual Studio Code
- AREPL for python: Real-time python scratchpad
- autocomplete-shell: Autocompletion for bash script in vscode
- autodocstring: Quickly generate docstrings for python functions
- Beanie: Asynchronous Python object-document mapper (ODM) for MongoDB. Data models are based on Pydantic.
- better-comments: Color code comments based on TODO/Alert/Warning etc.
- bookmarks: Bookmarks lines in code and jump to them
- code-spell-checker: Catch common spelling errors in codebase
- docker: Adds syntax highlighting, commands, hover tips, and linting for Dockerfile and docker-compose files.
- gc-excelviewer: View excel and CSV files inside Vscode
- git history diff: View git history. View diff of committed files. View git blame info. View stash details.
- gitblame: See git blame information in the status bar.
- githistory: View git log, file history, compare branches or commits
- gitignore: Auto create .gitignore files for various languages
- gitlens: Supercharge the Git capabilities built into Visual Studio Code
- IntelliCode: Provides AI-assisted development features for Python, TypeScript/JavaScript and Java developers in Visual Studio Code, with insights based on understanding your code context combined with machine learning.
- guides: Guides is simply an extension that adds various indentation guide lines
- Kite: AI code completions for all languages, intellisense, code snippets, code signatures, and cursor-following documentation for VS Code
- live server: Launch a local development server with live reload for static and dynamic pages
- LaTeX Workshop: Boost LaTeX typesetting efficiency with preview, compile, autocomplete, colorize, and more.
- local-history: A Visual Studio Code plugin for maintaining local history of files.
- Marp: Create slide decks written in Marp Markdown inside VS Code
- Material Icon Theme: Material Design Icons for Visual Studio Code
- mssql: Visual Studio Code SQL Server extension.
- output-colorizer: Syntax highlighting for log files
- path-autocomplete: Provides path completion for visual studio code.
- path-intellisense: Visual Studio Code plugin that autocompletes filenames
- pdf: Display pdf file in VSCode
- prettier: Code formatter for various languages
- Pylance: A performant, feature-rich language server for Python in VS Code
- python: Linting, Debugging (multi-threaded, remote), Intellisense, code formatting, refactoring, unit tests, snippets, and more.
- python-extended-snippets: Python Extended is a VS Code snippet pack that makes it easy to write Python code by providing completion options along with all arguments.
- python-extension-pack: All in one package of popular Visual Studio Code extensions for Python
- rainbow-brackets: Provide rainbow colors for the round brackets, the square brackets and the squiggly brackets.
- remote-ssh: Open any folder on a remote machine using SSH and take advantage of VS Code's full feature set.
- REST Client: REST Client allows you to send HTTP request and view the response in Visual Studio Code directly
- rewrap: Re-wraps comments and other text to a given line length.
- settings sync: Sync settings of vscode using github gists
- shell-format: Document formatter for shell scripts, Dockerfile, properties, gitignore, dotenv, hosts, jvmoptions, and more
- tabnine-vscode: All-language autocompleter — TabNine uses machine learning to help you write code faster.
- theme-dracula: Dark theme for vscode
- vim: Vim bindings support in vscode
- vscode-django: Beautiful syntax and scoped snippets for Django
- vscode-icons: Change file icons in vscode
- vscode-markdownlint: Markdown linting and style checking for Visual Studio Code
- vscode-pull-request-github: Review and manage your GitHub pull requests directly in VS Code
- vscodeintellicode: AI-assisted development features for Python, TypeScript/JavaScript and Java developers in Visual Studio Code, with insights based on understanding your code context combined with machine learning.
- File Utils: A convenient way of creating, duplicating, moving, renaming and deleting files and directories.
- Advanced New File: Create files anywhere in your workspace from the keyboard
- CodeSnap: Take beautiful screenshots of your code
- JSON Crack: Seamlessly visualize your JSON data instantly into graphs.
- nerdtree: Project and file navigation
- tagbar: An easy way to browse the class and modules
- vim-surround: Add, change, and delete surrounding parentheses, brackets, quotes, XML tags, and more
- vim-commentary: Comment/uncomment lines according to filetype
- python-mode: Python support in vim
- vim-fzf: fuzzy finder
- YouCompleteMe: Fast, as-you-type, fuzzy-search code completion engine
- ctrl-space: Tabs, buffers, and files management with fast fuzzy searching
- syntastic: Syntax checking plugin
- SQL Server Management Studio (SSMS)
- MongoDB
- MotorClient
- SQLite
- MySQL
- PostgreSQL
- Oracle Database
- Redis
- Cassandra
- Elasticsearch
- alembic
- Omniverse Audio2Face: Instantly create expressive facial animation from just an audio source using generative AI.
- MetaHuman: Complete framework that gives any creator the power to use highly realistic human characters in any way imaginable.
- Docker
- Kubernetes
- Lens
- Flux
- Kubesphere
- Jenkins
- rancher
- travis-yml
- AWS Lambda
- AWS EMR
- AWS SQS
- AWS ECR
- AWS S3
- Slurm: Managing and scheduling Linux clusters.
- Tmux: Open-source terminal multiplexer for Unix-like operating systems
- bash: control OS without having to navigate menus, options, and windows within a GUI
- ZSH: Unix shell that extends the Bourne shell with many improvements, including features from bash, ksh, and tcsh
- Emacs: An extensible, customizable, free/libre text editor
- Mendeley: Reference manager and academic social network that can help you organize your research.
- Zotero: Zotero is a free, easy-to-use tool to help you collect, organize, cite, and share research.
- Symlink: Points to another file or folder on your computer, or a connected file system
- dotfiles: Control the settings and preferences for applications and your system environment
- GNU Stow: Symlink farm manager which takes distinct packages of software and/or data located in separate directories on the filesystem, and makes them appear to be installed in the same place.
- vagrant: Tool for building complete development environments
- crontab: Schedule commands to run periodically in the background via the cron daemon on Linux
- accelerate: A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision
- ActiveMQ: Open-source, multi-protocol message broker, most commonly deployed as a standalone process
- apscheduler: Advanced Python Scheduler (APScheduler) is a Python library that lets you schedule your Python code to be executed later, either just once or periodically
- arize-phoenix: ML Observability in a Notebook - Uncover Insights, Surface Problems, Monitor, and Fine Tune your Generative LLM, CV and Tabular Models
- argparse: Write user-friendly command-line interfaces
- beautifulsoup: Pull data out of HTML and XML files
- bert-as-a-service: Generate BERT Embeddings for production
- black: Opinionated code formatter for Python code
- BLOOM: The World’s Largest Open Multilingual Language Model
- bokeh: Bokeh is a Python library for creating interactive visualizations for modern web browsers
- boto/boto3: Control AWS service with pure python code
- camelot: Extract tables from PDF files
- Celery: Task queues are used as a mechanism to distribute work across threads or machines.
- collections: Specialized container datatypes
- conda: Package, dependency and environment management
- concurrent.futures: Launching parallel tasks
- chime: Python sound notifications made easy.
- dabl: Data Analysis Baseline Library for quick data exploration and baseline models
- dask: Scale the Python tools you love
- datetime: Supplies classes for manipulating dates and times
- deepctr: Deep-learning based CTR models
- deepspeed: Deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
- deep-translator: A flexible, free, and unlimited tool to translate between different languages in a simple way using multiple translators.
- django: High-level Python Web framework
- djongo: Django and MongoDB database connector
- dlib: A toolkit for making real world machine learning and data analysis applications in C++
- doctest: Test interactive Python examples
- docx2txt: A pure python-based utility to extract text and images from docx files
- DPO Trainer: TRL supports the DPO Trainer for training language models from preference data, as described in the paper Direct Preference Optimization
- DSPy: The framework for programming—not prompting—foundation models
- Dynaconf: a dynamic configuration for Python applications
- einops: Flexible and powerful tensor operations for readable and reliable code.
- Embedding projector: Offers three commonly used methods of data dimensionality reduction, which allow easier visualization of complex data: PCA, t-SNE, and custom linear projections
- factscore: automatic evaluation metric for factual precision in long-form text generation
- FAISS: A library for efficient similarity search and clustering of dense vectors
- fastai: fastai makes deep learning with PyTorch faster, more accurate, and easier
- fastapi: FastAPI framework, high performance, easy to learn, fast to code, ready for production
- fasttext: Library for efficient text classification and representation learning
- faster-whisper: Reimplementation of OpenAI's Whisper model using CTranslate2, which is a fast inference engine for Transformer models.
- finetune: Scikit-learn style model finetuning for NLP
- flash-attn: Flash Attention: Fast and Memory-Efficient Exact Attention
- flask: Lightweight WSGI web application framework
- flask-restplus: Fully featured framework for fast, easy and documented API development with Flask
- Flower: Real-time monitoring using Celery Events
- fairscale: PyTorch extensions for high performance and large scale training.
- fugue: SQL for Pandas, Spark, and Dask DataFrames
- gdal: GDAL: Geospatial Data Abstraction Library
- gensim: Topic modelling, document indexing and similarity retrieval with large corpora.
- gpt-index: Former name of LlamaIndex; central interface to connect your LLMs with external data.
- gpt3-simple-primer: Simple GPT-3 primer using openai.
- gspread: Python library to interact with Google Sheets
- gunicorn: Production web server for Flask, Django apps
- h2oGPT: The world's best open source GPT
- hugging face: Build, train and deploy state of the art models powered by the reference open source in machine learning
- Haystack: LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data.
- hungabunga: HungaBunga: Brute-Force all sklearn models with all parameters using .fit .predict!
- hydra: Hydra is an open-source Python framework that simplifies the development of research and other complex applications.
- implicit: Fast Python Collaborative Filtering for Implicit Feedback Datasets
- interpret: Fit interpretable models. Explain blackbox machine learning.
- ipython: IPython: Productive Interactive Computing
- itertools: Functions creating iterators for efficient looping
- json: Read and write JSON files
- jupyter: Jupyter notebooks
- jupyterlab: An extensible environment for interactive and reproducible computing, based on the Jupyter Notebook and Architecture
- kedro: A Python framework for creating reproducible, maintainable and modular data science code
- keras: High-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.
- langchain: provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications
- LangSmith: LangSmith is a platform for building production-grade LLM applications.
- libffm: A Library for Field-aware Factorization Machines
- libfm: Factorization Machine Library
- lightfm: A Python implementation of LightFM, a hybrid recommendation algorithm.
- lime: Local Interpretable Model-Agnostic Explanations for machine learning classifiers
- LlamaIndex: Provides a central interface to connect your LLMs with external data.
- loguru: Loguru is a library which aims to bring enjoyable logging in Python.
- lora: LoRA for Efficient Stable Diffusion Fine-Tuning
- magic-wormhole: This package provides a library and a command-line tool named wormhole, which makes it possible to get arbitrary-sized files and directories (or short pieces of text) from one computer to another
- Manim: Animation engine for explanatory math videos
- matchzoo: MatchZoo is a toolkit for text matching
- matplotlib: Matplotlib strives to produce publication quality 2D graphics
- memory-profiler: A module for monitoring memory usage of a python program
- Modin: Provides an effortless way to speed up your pandas notebooks, scripts, and libraries.
- mongoengine: MongoEngine is a Python Object-Document Mapper for working with MongoDB.
- more_itertools: More routines for operating on iterables, beyond itertools
- multiprocessing-logging: Logger for multiprocessing applications
- mypy: Optional static type checker for Python that aims to combine the benefits of dynamic (or "duck") typing and static typing
- newspaper: Simplified python article discovery & extraction.
- nlopt: Library for nonlinear optimization, wrapping many algorithms for global and local, constrained or unconstrained, optimization
- nltk: Natural Language Toolkit
- netron: Visualizer for neural network, deep learning and machine learning models
- nats: Messaging system that lets applications exchange data segmented in the form of messages
- numpy: NumPy is the fundamental package for array computing with Python.
- nvitop: An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
- openai: Provides convenient access to the OpenAI API from applications written in the Python language
- openai-playground: Allows users to explore and experiment with OpenAI's artificial intelligence models
- OpenAPI: OpenAPI Specification provides a formal standard for describing HTTP APIs.
- opencv: Wrapper package for OpenCV python bindings.
- Optimum: 🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
- pandasql: Allows you to query pandas DataFrames using SQL syntax
- pandera: A Statistical Data Testing Toolkit
- pandarallel: An easy to use library to speed up computation (by parallelizing on multi CPUs) with pandas.
- pandas: Powerful data structures for data analysis, time series, and statistics
- pandas-profiling: Provides a one-line Exploratory Data Analysis (EDA) experience in a consistent and fast solution
- Panel: The powerful data exploration & web app framework for Python
- patsy: Describing statistical models in Python
- pdf2image: A wrapper around the pdftoppm and pdftocairo command line tools to convert PDF to a PIL Image list.
- PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware
- petals: Run 100B+ language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
- pillow: Python Imaging Library (Fork)
- pipenv: Pipenv is a tool that aims to bring the best of all packaging worlds (bundler, composer, npm, cargo, yarn, etc.) to the Python world.
- pipreqs: Generate a pip `requirements.txt` file based on the imports of any project.
- plotly: An open-source, interactive graphing library for Python
- poetry: Python packaging and dependency management made easy
- pre-commit: A framework for managing and maintaining multi-language pre-commit hooks.
- presidio: Context aware, pluggable and customizable data protection and de-identification SDK for text and images
- Promptify: Prompt engineering toolkit for solving NLP problems with LLMs and easily generating NLP task prompts for popular generative models like GPT, PaLM, and more
- prophet: Microframework for analyzing financial markets.
- pyaudio: Cross-platform audio I/O with PortAudio
- pydash: The kitchen sink of Python utility libraries for doing "stuff" in a functional way
- Pylint: It's not just a linter that annoys you!
- Pipe: Write clean python Code
- pydantic: Most widely used data validation library for Python.
- PydanticAI: A Python Agent Framework designed to make it less painful to build production grade applications with Generative AI.
- PyMuPDF: PyMuPDF is a Python binding for MuPDF – a lightweight PDF, XPS, and E-book viewer, renderer, and toolkit, which is maintained and developed by Artifex Software, Inc
- pymongo: Python driver for MongoDB
- pymysql: Pure Python MySQL Driver
- pyod: PyOD is the most comprehensive and scalable Python library for detecting outlying objects in multivariate data.
- pyodbc: pyodbc is an open source Python module that makes accessing ODBC databases simple
- pypdf2: PDF toolkit
- pyppeteer: Headless chrome/chromium automation library (unofficial port of puppeteer)
- pyspark: Apache Spark Python API
- pyttsx3: text-to-speech conversion library in Python
- pytest: pytest: simple powerful testing with Python
- python-dotenv: Add .env support to your django/flask apps in development and deployments
- pytorch: Open source machine learning framework
- pytorch-transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch
- pyyaml: YAML 1.1 parser
- QLoRA: efficient Finetuning of Quantized LLMs
- RabbitMQ: A queue in RabbitMQ is an ordered collection of messages
- Ragas: Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
- rasterio: Reads and writes GeoTIFF formats and provides a Python API based on N-D arrays
- RapidAPI: Discover and connect to thousands of APIs
- Ray: Effortlessly scale your most complex workloads
- re: Regular expression matching operations
- requests: HTTP library for Python
- schedule: Python job scheduling for humans. Run Python functions (or any other callable) periodically using a friendly syntax
- scikit-image: Collection of algorithms for image processing
- scikit-learn: Tools for data mining and data analysis and machine learning in Python
- scikit-surprise: Python RecommendatIon System Engine
- scrapy: Framework for extracting the data you need from websites
- seaborn: Data visualization library based on matplotlib.
- selenium: Provides a simple API to write functional/acceptance tests using Selenium WebDriver
- Sentence-Transformers: Python framework for state-of-the-art sentence, text and image embeddings
- sentry-sdk: Sentry's Python SDK enables automatic reporting of errors and performance data in your application.
- Supervised Fine-tuning Trainer: This class is a wrapper around the `transformers.Trainer` class and inherits all of its attributes and methods. The trainer takes care of properly initializing the `PeftModel` in case a user passes a `PeftConfig` object.
- shap: Explain the output of any machine learning model
- shutil: Offers a number of high-level operations on files and collections of files
- sketch: understands the context of your data, greatly improving the relevance of suggestions
- spacy: Library for advanced Natural Language Processing in Python
- SpanMarker: SpanMarker for Named Entity Recognition
- sqlalchemy: Python SQL toolkit
- StableLM: StableLM: Stability AI Language Models
- Stanford Alpaca: Alpaca: A Strong, Replicable Instruction-Following Model
- Supervised Fine-tuning Trainer: Involves adapting a pre-trained large language model (LLM) to a specific downstream task using labeled data
- sympy: Python library for symbolic mathematics
- tabula-py: Python wrapper of tabula-java, which can read tables in PDF files
- taichi: Productive, portable, and performant GPU programming in Python.
- tensorflow: Core open source library to develop and train ML models
- TensorFlow Playground: A Neural Network Playground
- tika: An interface that provides the facility to extract content and metadata from any type of document
- tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models.
- triton: Development repository for the Triton language and compiler
- txtai: Build AI-powered semantic search applications
- tqdm: Displays progress bar for list iterations
- tracemalloc: Trace memory allocations
- trafilatura: A Python package & command-line tool to gather text on the Web
- Trainer: Provides an API for feature-complete training in PyTorch for most standard use cases
- Trio: a friendly Python library for async concurrency and I/O
- urllib: Collects several modules for working with URLs
- vmap: vmap is the vectorizing map; vmap(func) returns a new function that maps func over some dimension of the inputs.
- vectorhub: Library for easy discovery, and consumption of State-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc)
- Vertex AI: Train and deploy ML models
- vaex: a partial Pandas replacement that uses lazy evaluation and memory mapping to allow developers to work with large datasets on standard machines
- vllm: A high-throughput and memory-efficient inference and serving engine for LLMs
- wandb: MLOps platform that helps AI developers streamline their ML workflow end to end.
- Websocket: real-time, event-driven communication between clients and servers
- Whisper: Automatic speech recognition model trained on 680,000 hours of multilingual data collected from the web.
- whisper-jax: optimised implementation of the Whisper model by OpenAI
- xgboost: Distributed gradient boosting library
- xlearn: High performance, easy-to-use, and scalable machine learning package
- xlrd: Extract data from Excel spreadsheets
- yaml: YAML parser and emitter for Python
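
Several entries above (`re`, `collections`, `itertools`, `concurrent.futures`) ship with Python's standard library, so they can be tried without installing anything. Below is a minimal sketch of how they might be combined. The sample data and the names `LOG_LINES`, `extract_level`, and `count_levels` are hypothetical and made up for illustration; none of them come from the listed tools.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Optional, Sequence

# Hypothetical sample data, invented for illustration only.
LOG_LINES = [
    "2024-01-01 INFO service started",
    "2024-01-01 WARNING disk usage high",
    "2024-01-02 ERROR connection lost",
    "2024-01-02 INFO service restarted",
    "2024-01-03 INFO heartbeat ok",
]

# re: compile the pattern once and reuse it for every line.
LEVEL_PATTERN = re.compile(r"\b(INFO|WARNING|ERROR)\b")


def extract_level(line: str) -> str:
    """Return the log level found in a line, or 'UNKNOWN' if none matches."""
    match = LEVEL_PATTERN.search(line)
    return match.group(1) if match else "UNKNOWN"


def count_levels(lines: Sequence[str], limit: Optional[int] = None) -> Counter:
    """Count log levels in the first `limit` lines (all lines when limit is None)."""
    # itertools: islice takes a prefix without copying the whole input first.
    selected = list(islice(lines, limit))
    # concurrent.futures: executor.map preserves input order.
    # Threads are used here only to show the API, not because this work needs them.
    with ThreadPoolExecutor(max_workers=2) as executor:
        levels = list(executor.map(extract_level, selected))
    # collections: Counter tallies the occurrences of each level.
    return Counter(levels)


if __name__ == "__main__":
    print(count_levels(LOG_LINES).most_common())
```

For CPU-bound work, `ProcessPoolExecutor` from the same module is usually the better fit; the thread pool is used here only because it keeps the sketch short.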