The Lucd Enterprise AI Data Science Platform is a highly secure, scalable, open, and flexible platform for persisting and fusing large and numerous datasets and for training AI models for production against those datasets. It is an end-to-end platform that can be deployed in public cloud environments or on premises on bare-metal hardware, or accessed directly as the Lucd multi-tenant PaaS. The platform consists of:
- A scalable open data ingest capability
- A petabyte-scale Unified Data Space data repository
- 3D visualization and exploration
- An Exploratory Data Analysis REST service
- A Kubernetes environment to train PyTorch and TensorFlow models
- NLP Word Embedding and Explainable AI Assets
- Visualization of model results and export to an internal or external serving capability
The Lucd Python Client is a Python package created to:
- Provide a programming interface to the Lucd platform. It accesses and uses the same REST service that the Lucd 3D UI uses.
- Allow custom operations that are not, or not yet, implemented in the 3D UI.
- Allow data scientists to test custom-developed models locally on subsets of data in the Lucd Unified Data Space before training at scale in the Lucd platform.
The Lucd Python Client is packaged following the standard "Packaging Python Projects" approach. In the future, this package will be available via pip install and/or conda install. For now, it is a local package distributed by Lucd.
- As people who work in Python know, there are many ways to work in Python. Many people (including Lucd developers) use PyCharm. Others use Vim or other IDEs. Which environment you choose is your decision. These notes are geared toward Anaconda3 and Jupyter Notebooks, but that is not a requirement.
- You can use your base environment or create a dedicated environment for your Lucd Python Client work.
- Whichever environment you choose, it should run Python 3.6 (specifically 3.6.5). The Lucd platform uses Python 3.6, so matching that version keeps your work compatible. If you are starting from scratch with a fresh Anaconda install, the following installer for 64-bit Windows sets up Python 3.6.5 as the base environment: https://repo.anaconda.com/archive/Anaconda3-5.2.0-Windows-x86_64.exe. For Mac, Linux, or 32-bit systems, pick the appropriate Anaconda3 installer from https://repo.anaconda.com/archive/. Alternatively, you can install the most recent Anaconda release (with Python 3.7) and create your own 3.6.5 environment. Again, Anaconda is not a requirement; you can use PyCharm, other IDEs, or any compatible Python environment if you prefer.
- Python environments involve many packages and dependencies. You should ensure that at least the following packages are included in your environment:
  - Dask 2.13 or greater
  - Dask ML
  - Dask Distributed
  - TensorFlow > 2.0
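
The Python version and package requirements above can be checked from inside the environment you plan to use. The following is a minimal sketch, not part of the Lucd package: the helper names `python_is_36`, `parse_version`, and `check_packages` are hypothetical, the import names `dask`, `dask_ml`, `distributed`, and `tensorflow` are assumed to correspond to the packages listed, and each package is assumed to expose a `__version__` attribute. The TensorFlow requirement is treated here as a minimum of 2.0.

```python
import importlib
import sys

# Import names assumed to correspond to the packages listed above,
# mapped to a minimum version (None means any installed version is accepted).
REQUIRED = {
    "dask": (2, 13),
    "dask_ml": None,
    "distributed": None,
    "tensorflow": (2, 0),
}


def python_is_36(version_info=sys.version_info):
    """Return True when the interpreter is a Python 3.6.x release."""
    return tuple(version_info[:2]) == (3, 6)


def parse_version(text):
    """Turn '2.13.0' into (2, 13, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_packages(required=REQUIRED):
    """Return {import_name: status} for each required package.

    Status is 'ok', 'missing', 'version unknown', or 'too old (<version>)'.
    """
    report = {}
    for name, minimum in required.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        version_text = str(getattr(module, "__version__", ""))
        found = parse_version(version_text)
        if minimum is None:
            report[name] = "ok"
        elif not found:
            report[name] = "version unknown"
        elif found < minimum:
            report[name] = "too old ({})".format(version_text)
        else:
            report[name] = "ok"
    return report
```

Calling `python_is_36()` and `check_packages()` in a notebook cell or a Python prompt shows whether the current environment meets the requirements before you install the client.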
When you run setup.py from your Lucd Python Package, two folders sometimes do not make it into your site-packages directory. To check, go to your local site-packages directory. This is usually someplace like:
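
The exact location differs between installations. One way to find it is to ask the interpreter itself. A minimal sketch using the standard library `sysconfig` module (the helper name `site_packages_dir` is only for illustration):

```python
import sysconfig


def site_packages_dir():
    """Return the directory where pure-Python packages are installed."""
    return sysconfig.get_paths()["purelib"]


print(site_packages_dir())
```

Run it with the Python from the environment where you installed the Lucd Python Client, since each environment has its own site-packages directory.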
Within that folder you should see the Lucd folder, and within the Lucd folder there is an Operation folder. The Operation folder should contain the following:
- a folder called __pycache__
- a folder called lib
- a folder called eda
- a file called __init__.py
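
The check above can also be done with a short script. This is a sketch under assumptions: the function name `missing_entries` is hypothetical, the entries written as pycache and init.py are assumed to be the standard `__pycache__` folder and `__init__.py` file, and the path to the Operation folder is passed in by the caller because its exact location depends on your environment.

```python
from pathlib import Path

# Entries that should be present in the Operation folder, per the list above.
EXPECTED = ("__pycache__", "lib", "eda", "__init__.py")


def missing_entries(operation_dir, expected=EXPECTED):
    """Return the expected names that do not exist inside operation_dir."""
    root = Path(operation_dir)
    return [name for name in expected if not (root / name).exists()]
```

An empty list means nothing is missing. Note that `__pycache__` is created by Python on first import, so it may be absent on a fresh install without indicating a problem.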
If lib and eda are not there, go to the location of your Lucd Python Package:
There you will see the lib and eda folders. Copy both into the Operation folder, i.e., into:
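
The manual copy described above can also be scripted. A minimal sketch, assuming you supply both paths yourself: `package_dir` is the folder in your Lucd Python Package that contains lib and eda, and `operation_dir` is the destination folder inside site-packages. The function name `copy_missing_folders` is hypothetical and not part of the Lucd package.

```python
import shutil
from pathlib import Path


def copy_missing_folders(package_dir, operation_dir, names=("lib", "eda")):
    """Copy each named folder from package_dir into operation_dir if absent.

    Returns the list of folder names that were copied.
    """
    source_root = Path(package_dir)
    target_root = Path(operation_dir)
    copied = []
    for name in names:
        source = source_root / name
        target = target_root / name
        if target.exists():
            continue  # already present in site-packages; leave it untouched
        if not source.is_dir():
            raise FileNotFoundError("expected folder not found: {}".format(source))
        shutil.copytree(str(source), str(target))
        copied.append(name)
    return copied
```

Folders that already exist at the destination are skipped, so running the function a second time copies nothing.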