Supported Python packages

The following Python packages are pre-installed in Jori and can be imported directly in user code:

  1. PySpark (version 3.3.1):

    • PySpark is the Python API for Apache Spark, a distributed computing framework for big data processing.

    • It provides a simple and expressive programming model for processing large datasets in parallel across clusters.

    • PySpark supports various data processing tasks, including data manipulation, machine learning, and graph processing.

  2. Matplotlib:

    • Matplotlib is a plotting library for creating static, animated, and interactive visualizations in Python.

    • It provides a wide range of plotting functions for creating line plots, scatter plots, bar charts, histograms, and more.

    • Matplotlib allows customization of plot elements, such as colors, labels, titles, and axes.

  3. Pydeck:

    • Pydeck is a Python library for creating interactive visualizations using deck.gl, a WebGL-powered framework for visual exploratory data analysis.

    • It allows users to create interactive maps, 3D visualizations, and layered visualizations using a declarative API.

  4. Scikit-learn:

    • Scikit-learn is a machine learning library for Python, providing a wide range of supervised and unsupervised learning algorithms.

    • It includes tools for data preprocessing, model selection, evaluation metrics, and feature extraction.

    • Scikit-learn is widely used for tasks such as classification, regression, clustering, and dimensionality reduction.

  5. PyTorch:

    • PyTorch is an open-source machine learning library for Python, primarily used for developing and training deep neural networks.

    • It provides a dynamic computational graph and supports both eager execution and graph-based execution.

    • PyTorch is known for its flexibility, ease of use, and strong GPU acceleration.

  6. TensorFlow:

    • TensorFlow is an open-source machine learning framework developed by Google.

    • It provides a comprehensive ecosystem for building and deploying machine learning models, with a focus on deep learning.

    • TensorFlow offers high-level APIs, such as Keras, for easy model building and supports distributed training across multiple devices.

  7. Plotly:

    • Plotly is a web-based plotting library for creating interactive and publication-quality visualizations.

    • It supports a wide range of chart types, including line charts, scatter plots, heatmaps, 3D plots, and more.

    • Plotly allows users to create interactive and responsive visualizations that can be easily shared and embedded in web applications.

  8. DeepSurv:

    • DeepSurv is a Python package for survival analysis using deep learning.

    • It implements the DeepSurv model, a deep neural network that generalizes the Cox proportional hazards model, for modeling time-to-event data.

    • DeepSurv allows for the incorporation of complex non-linear relationships and high-dimensional covariates in survival analysis.

  9. Autoprognosis:

    • Autoprognosis is a Python package for automated machine learning (AutoML) in prognostic modeling.

    • It provides a high-level API for automatically building and evaluating prognostic models, such as risk classifiers and survival (time-to-event) models.

    • Autoprognosis simplifies the process of model selection, hyperparameter tuning, and model evaluation in prognostic modeling tasks.

  10. Scikit-survival:

    • Scikit-survival is a Python module for survival analysis built on top of scikit-learn.

    • It provides a set of tools and algorithms for analyzing time-to-event data, including survival estimators, evaluation metrics, and regression models.

    • Scikit-survival integrates well with scikit-learn and offers a consistent API for survival analysis tasks.
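As an illustration of how one of the listed packages might be used, the sketch below covers the scikit-learn workflow described in item 4: preprocessing, a train/test split, model fitting, and evaluation. The dataset (the bundled Iris data), the choice of logistic regression, and the split parameters are assumptions made for the example, not recommendations specific to Jori.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small example dataset that ships with scikit-learn (illustrative choice).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows for evaluation; stratify to keep class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Chain preprocessing and the classifier so both are fitted on training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Wrapping the scaler and the classifier in one pipeline keeps the scaling statistics from leaking test data into training, which is the main reason to prefer it over calling each step by hand.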

Users can import these packages in their Python code and use them for data processing, machine learning, visualization, and survival analysis tasks.
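To confirm which of these packages are importable in a given session, a check along the following lines may help. It is a minimal sketch that uses only the standard library. The import names in `IMPORT_NAMES` (for example `sklearn` for Scikit-learn, `sksurv` for Scikit-survival, and `deepsurv` for DeepSurv) are assumptions based on each project's usual module name and should be verified against the actual Jori environment. `check_packages` is a hypothetical helper, not part of Jori.

```python
from importlib.util import find_spec

# Display name -> assumed top-level import name (verify in your environment).
IMPORT_NAMES = {
    "PySpark": "pyspark",
    "Matplotlib": "matplotlib",
    "Pydeck": "pydeck",
    "Scikit-learn": "sklearn",
    "PyTorch": "torch",
    "TensorFlow": "tensorflow",
    "Plotly": "plotly",
    "DeepSurv": "deepsurv",
    "Autoprognosis": "autoprognosis",
    "Scikit-survival": "sksurv",
}


def check_packages(import_names):
    """Return {display name: bool} by locating each module without importing it."""
    return {
        name: find_spec(module) is not None
        for name, module in import_names.items()
    }


# Print one status line per package.
for name, available in check_packages(IMPORT_NAMES).items():
    status = "available" if available else "missing"
    print(f"{name:<16} {status}")
```

Using `find_spec` rather than a plain `import` avoids the startup cost of loading heavy libraries such as TensorFlow or PyTorch just to see whether they are present.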
