GenAI Stack (old)
v0.1.0

Data Extraction and Loading



Explanation

Data extraction and loading follows the ETL pattern: sourcing data from diverse origins, transforming it into a usable form, and loading it into a target system.

ETL stands for Extract, Transform, and Load. These are the three main steps for moving data from a source to a target destination.

Here, we fetch documents from various sources (Extract), convert them into embeddings (Transform), and finally load those embeddings into a vector database (Load). This ETL process therefore covers the whole data loading path from a source to a vector database destination.
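The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the GenAI Stack API: `extract`, `embed`, `InMemoryVectorStore`, and `run_etl` are hypothetical names, the embedding is a toy hashed bag-of-words rather than a real model, and the "vector database" is just an in-memory list.

```python
import hashlib
import math


def extract(sources):
    """Extract: collect raw text documents, skipping empty ones.

    Hypothetical stand-in: the 'sources' here are plain in-memory strings.
    """
    return [text.strip() for text in sources if text.strip()]


def embed(text, dim=8):
    """Transform: toy deterministic embedding (hashed bag of words).

    This is NOT a real embedding model; it only illustrates the step.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit length, so dot product = cosine


class InMemoryVectorStore:
    """Load target: a hypothetical stand-in for a vector database."""

    def __init__(self):
        self.rows = []

    def add(self, text, vector):
        self.rows.append((text, vector))

    def search(self, vector, k=1):
        # Rank stored documents by dot product with the query vector.
        def score(row):
            return sum(a * b for a, b in zip(row[1], vector))

        ranked = sorted(self.rows, key=score, reverse=True)
        return [text for text, _ in ranked[:k]]


def run_etl(sources, store):
    """Run Extract -> Transform -> Load and return the number of documents."""
    docs = extract(sources)
    for doc in docs:
        store.add(doc, embed(doc))
    return len(docs)
```

A call such as `run_etl(["hello world", "vector databases store embeddings"], InMemoryVectorStore())` would load two documents; in a real pipeline each function would be replaced by an actual loader, embedding model, and vector database client.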

Our workflow diagram:

Supported Data Loaders:

We currently support three ETL platforms:

  • Airbyte

  • Llama Hub

  • LangChain

You can use any one of these loaders to carry out the ETL process.
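One way to picture interchangeable loaders is a small registry keyed by platform name. The sketch below is an assumption for illustration only: `LOADERS`, `load_documents`, and the stub loaders are hypothetical and do not call Airbyte, Llama Hub, or LangChain, nor do they reflect how GenAI Stack wires them internally.

```python
from typing import Callable, Dict, List

# A loader takes a source identifier and returns a list of text documents.
Loader = Callable[[str], List[str]]


def _stub_loader(platform: str) -> Loader:
    """Build a placeholder loader; a real one would call the platform's connector."""

    def load(source: str) -> List[str]:
        return [f"[{platform}] document from {source}"]

    return load


# Hypothetical registry: one entry per supported ETL platform.
LOADERS: Dict[str, Loader] = {
    "airbyte": _stub_loader("airbyte"),
    "llama_hub": _stub_loader("llama_hub"),
    "langchain": _stub_loader("langchain"),
}


def load_documents(platform: str, source: str) -> List[str]:
    """Pick a loader by name and run it against the given source."""
    try:
        loader = LOADERS[platform]
    except KeyError:
        raise ValueError(
            f"Unsupported loader {platform!r}; choose from {sorted(LOADERS)}"
        ) from None
    return loader(source)
```

Because every loader shares the same call shape, the rest of the pipeline does not need to know which platform produced the documents.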

Figure: Data Loaders Architecture Diagram