The Lucd Modeling Framework (LMF) enables developers to build custom AI models and interface them with the Lucd JedAI platform for streamlined management, experimentation, and training, using data and parameters established in the JedAI Unity client (or simply, the Unity client). The framework supports Python-based AI models built with TensorFlow, PyTorch, Scikit-learn, and XGBoost (Dask module), as well as federated models. LMF’s Python libraries support the following tasks:
- accessing Lucd virtual datasets (VDSes) for model training and evaluation,
- analyzing and reporting model performance metrics (e.g., with confusion matrices, ROC curves),
- storing structures representing trained models and training checkpoints.
Additionally, LMF supports distributed model training using the Horovod framework (https://horovod.ai/). As of version 6.7.0, LMF support for Horovod has been tested only with PyTorch models.
Model Development Approaches¶
LMF provides flexibility in the level of effort and control needed to prepare models for Lucd. The two approaches are the full model approach and the compact model approach; their differences are illustrated in Figure 1.
Figure 1. Conceptual illustration of full and compact model approaches.
Full Model Approach¶
In the full model approach, a developer creates an AI model and manually uses LMF Python libraries to complete the model training workflow (e.g., training, validation, holdout data testing, storing results). This enables complete flexibility for more advanced use cases, which might include designing complex or experimental training loops, advanced performance analysis, custom model compression, etc. Full models are implemented as standard Python scripts. Further details are in the Developing Full Models section of this documentation.
Compact Model Approach¶
The compact model approach enables a developer to focus most, if not all, effort on defining an AI model, leaving other workflow tasks, such as holdout data testing and storage of performance results, for the LMF to handle automatically behind the scenes. In the case of TensorFlow, the developer does not even need to write training logic. The major benefits of the compact model approach are (1) significantly less coding effort and (2) fewer errors and inconsistencies in boilerplate performance-testing logic. These benefits are especially useful when formatting models for multi-run experiments such as k-fold cross validation and learning curves (to be introduced in an upcoming LMF release). Further details about compact modeling are in Developing Compact Models.
Distributed Model Training¶
To use distributed model training, a developer needs only to be familiar with using the Horovod Python library to distribute their model. More details are covered in Distributed Model Training.
Federated Machine Learning¶
LMF also supports federated machine learning, which allows models to be built and trained across distinct remote systems (known as federates). This capability is especially useful when data cannot, or should not, be moved across systems: federated machine learning moves the model across the systems for training instead of moving the data. Further details about federated modeling are in Developing Federated Models.
Notable Framework Capabilities¶
The LMF consists of an evolving set of capabilities. The following subsections describe notable modeling capabilities supported as of release 6.6.0.
TensorFlow Estimator-Based Modeling¶
TensorFlow supports AI modeling using either low-level APIs or the easier-to-use, high-level Estimator APIs. The LMF is designed to support Estimator-based model development. Keras may be used to create models, especially when more customization is needed; however, such models must be converted to Estimators for the LMF and the broader Lucd JedAI platform to manage them appropriately. See the following link for an introduction to TensorFlow Estimators: https://www.tensorflow.org/guide/estimator.
Various Feature Types¶
For TensorFlow modeling, all dataset feature column types are supported (see https://www.tensorflow.org/guide/feature_columns), enabling support for a broad range of numeric and categorical features. Regarding categorical features, the domain of such a feature must be known at training time. For example, if you choose to use a feature car_make as a categorical feature, you must know all the possible makes when you write your model. This requirement will be removed in a future release. Also, the conversion of non-numerical data to numerical data (e.g., for encoding label/target values) based on a scan of the entire dataset is not supported in the current release. However, to help with this, data value replacement operations are supported in the Unity client.
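To illustrate the domain-known-at-training-time requirement, here is a minimal sketch of declaring feature columns for the hypothetical `car_make` example; the vocabulary list and feature names are illustrative, not part of any Lucd API.

```python
import tensorflow as tf

# Categorical feature: the full vocabulary (domain) must be enumerated
# when the model is written, per the current LMF requirement.
car_make = tf.feature_column.categorical_column_with_vocabulary_list(
    "car_make", vocabulary_list=["ford", "honda", "toyota"])

# One-hot encode the categorical column for use in a DNN-style Estimator.
car_make_onehot = tf.feature_column.indicator_column(car_make)

# Numeric features need no vocabulary.
mileage = tf.feature_column.numeric_column("mileage", dtype=tf.float32)

# The columns would then be passed to an Estimator, e.g.:
# estimator = tf.estimator.DNNClassifier(
#     feature_columns=[car_make_onehot, mileage],
#     hidden_units=[16, 8], n_classes=2)
```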
For TensorFlow modeling, label types are assumed to be TensorFlow int32.
For TensorFlow and PyTorch modeling, LMF supports the use of embedding data, e.g., word2vec for representing free text. For PyTorch, the TorchText library is supported, but n-grams are not supported in the current release.
Important Note: Currently, when using text input, only the text/embedding input is allowed as a feature, enabling conventional text classification. Future releases will enable the use of multiple feature inputs alongside text data.
For TensorFlow and PyTorch modeling, use of image data (i.e., pixel values) as model input is supported.
Scikit-learn models and Scikit-learn pipelines are also supported. The use of sklearn.preprocessing.FunctionTransformer and other custom transformers within pipelines is not supported.
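For example, a pipeline composed only of built-in Scikit-learn transformers and estimators fits within these constraints. The data and pipeline steps below are a minimal illustrative sketch, not Lucd-specific code.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Tiny toy dataset for illustration.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([0, 0, 1, 1])

# Built-in transformers only: no FunctionTransformer or custom
# transformer classes, per the current LMF restriction.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])
pipeline.fit(X, y)
preds = pipeline.predict(X)
```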
Distributed XGBoost using Dask¶
Distributed training of XGBoost models using the Dask parallel data analytics framework is supported. Current versions of XGBoost (1.3.1) include a Dask module natively within the XGBoost library (the separate dask-xgboost project was migrated into it). See the following link for more information: https://xgboost.readthedocs.io/en/latest/tutorials/dask.html.
Support for TensorFlow and PyTorch distributed training is under development.
The Lucd modeling framework supports the following languages and machine-learning-related libraries:
- Python v3.6.5
- TensorFlow (for Python) v2.1
- PyTorch v1.6.0
- TorchText
- Dask v2021.1.0
- XGBoost v1.3.1
- NumPy v1.16.4
- Scikit-learn v0.19.2
- Pandas v0.25.1
While this documentation introduces all the core components and best practices for developing AI models for the Lucd JedAI platform, there is rarely a replacement for sample code. The Lucd Model Shop provides a wide range of code (prepared by Lucd engineers) to help developers get started with preparing AI models. In the future, the Lucd Model Shop will also allow for the larger Lucd developer community to share their code, further helping others with their AI goals.
Python API Documentation¶
The LMF Python API documentation can be found in the following Lucd GitLab Pages site, https://lucd.pages.lucd.ai/product-development/lucd-eda-rest/.
Preparing Models Using the Lucd Modeling Framework¶
The following documentation contains further details and examples for developing AI models for Lucd.
- Developing Compact Models
- Developing Full Models
- Developing Federated Models
- Working with Data and Performance Analysis
- The Lucd Model Shop
An important note for developing PyTorch models in Lucd: before saving the model, “eval” mode must be activated. See the following link for more details: https://pytorch.org/tutorials/beginner/saving_loading_models.html?highlight=eval.
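In code, this amounts to calling `model.eval()` just before saving. The model below is a hypothetical placeholder; only the eval-before-save pattern is the point.

```python
import torch
import torch.nn as nn

# Hypothetical minimal model; the layers are illustrative only.
model = nn.Sequential(nn.Linear(4, 2), nn.Dropout(p=0.5))

# ... training loop would go here ...

# Put dropout/batch-norm layers into inference mode before saving,
# as required when preparing PyTorch models for Lucd.
model.eval()
torch.save(model.state_dict(), "model.pt")
```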