Getting Started
Installation
Install the latest version via pip...
pip install be-great
... or download the source code from GitHub
git clone https://github.com/kathrinse/be_great.git
Requirements
GReaT requires Python 3.9 (or higher) and the following packages:
- datasets >= 2.5.2
- numpy >= 1.23.1
- pandas >= 1.4.4
- scikit_learn >= 1.1.1
- torch >= 1.10.2
- tqdm >= 4.64.1
- transformers >= 4.22.1
Quickstart
In the example below, we show how the GReaT approach is used to generate synthetic tabular data for the California Housing dataset.
from be_great import GReaT
from sklearn.datasets import fetch_california_housing
data = fetch_california_housing(as_frame=True).frame
model = GReaT(llm='distilgpt2', epochs=50)
model.fit(data)
synthetic_data = model.sample(n_samples=100)
See Examples to find more details.