Using Inwire

This guide covers day-to-day workflows in Inwire — from registering models to running training jobs and setting up RAG pipelines. By the end, you'll understand how the major modules work together in typical ML workflows.

Overview of Common Workflows

A typical ML workflow in Inwire moves through four stages, each handled by its own module:

┌─────────────────────────────────────────────────────────────────────────────┐
│                         Typical ML Workflow in Inwire                        │
└─────────────────────────────────────────────────────────────────────────────┘

   ┌─────────┐      ┌─────────┐      ┌─────────┐      ┌─────────┐
   │  Data   │  →   │ Model   │  →   │ Deploy  │  →   │ Monitor │
   │  Prep   │      │Training │      │         │      │         │
   └─────────┘      └─────────┘      └─────────┘      └─────────┘
        │                │                │                │
        ▼                ▼                ▼                ▼
   ┌─────────┐      ┌─────────┐      ┌─────────┐      ┌─────────┐
   │ Synthex │      │ Model   │      │ModelOps │      │ Observe │
   │         │      │Training │      │         │      │         │
   └─────────┘      └─────────┘      └─────────┘      └─────────┘

Working with Data in Synthex

Synthex is the "data twin" of Model Training — it manages all data-related operations including:

Dataset ingestion and profiling
Data cleaning and transformation
Synthetic data generation
Data quality evaluation

Importing a Dataset

Navigate to Synthex → Datasets
Click Import Dataset
Choose your source:

- Upload File — CSV, Parquet, JSONL

- Cloud Storage — S3, GCS, Azure Blob

- Database — Direct query (if configured)

Configure import options:

| Option | Description |
|--------|-------------|
| Name | Descriptive dataset name |
| Description | What this data represents |
| Source Type | Tabular, Text, Time Series, etc. |
| Tags | For organization and filtering |

Click Import

Inwire then profiles the data and detects its schema automatically.

Creating a Data Profile

Data profiles define the schema and characteristics of your dataset:

Go to Synthex → Data Profiles
Click Create Profile
Define columns:

```
Column: customer_id
Type: String
Constraints: Unique, Not Null

Column: transaction_amount
Type: Float
Constraints: Min: 0, Max: 100000

Column: is_fraud
Type: Boolean
Distribution: 1% True, 99% False
```

Save the profile
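
Because the profile above is declarative, its constraints can be checked mechanically. The following is a minimal Python sketch of how such a check might look. `ColumnRule`, `PROFILE`, and `validate` are hypothetical names chosen for illustration; they are not part of Inwire or Synthex, and the `Distribution` line of the profile is not checked here.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ColumnRule:
    """One column of a data profile (hypothetical structure, not Inwire's API)."""
    name: str
    py_type: type
    unique: bool = False
    not_null: bool = False
    min_value: Optional[float] = None
    max_value: Optional[float] = None


# Mirrors the example profile; the Distribution line is not modeled.
PROFILE = [
    ColumnRule("customer_id", str, unique=True, not_null=True),
    ColumnRule("transaction_amount", float, min_value=0, max_value=100_000),
    ColumnRule("is_fraud", bool),
]


def validate(records: List[Dict[str, Any]], profile: List[ColumnRule]) -> List[str]:
    """Return one human-readable message per constraint violation."""
    errors = []
    seen = {rule.name: set() for rule in profile if rule.unique}
    for row, record in enumerate(records):
        for rule in profile:
            value = record.get(rule.name)
            if value is None:
                if rule.not_null:
                    errors.append(f"row {row}: {rule.name} is null")
                continue
            if not isinstance(value, rule.py_type):
                errors.append(f"row {row}: {rule.name} is not {rule.py_type.__name__}")
                continue
            if rule.min_value is not None and value < rule.min_value:
                errors.append(f"row {row}: {rule.name} is below {rule.min_value}")
            if rule.max_value is not None and value > rule.max_value:
                errors.append(f"row {row}: {rule.name} is above {rule.max_value}")
            if rule.unique:
                if value in seen[rule.name]:
                    errors.append(f"row {row}: {rule.name} duplicates an earlier row")
                seen[rule.name].add(value)
    return errors


sample = [
    {"customer_id": "c-1", "transaction_amount": 42.5, "is_fraud": False},
    {"customer_id": "c-1", "transaction_amount": -3.0, "is_fraud": False},
]
for problem in validate(sample, PROFILE):
    print(problem)
```

The second sample row breaks two rules (a repeated `customer_id` and a negative amount), so the sketch reports two problems for it.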

Generating Synthetic Data

For details on synthetic data generation, see the Synthex User Guide.

Quick generation workflow:

Go to Synthex → Generate Data
Select a data profile or source dataset
Choose generation method:

- Statistical — Fast, preserves distributions

- GAN-based — Higher quality, slower

- LLM-based — For text and complex patterns

Configure options (record count, privacy level)
Click Generate
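
To make the idea of statistical generation concrete, here is a small Python sketch that samples records from declared distributions: unique IDs, a uniform amount between 0 and 100000, and a 1% fraud rate. This is an illustration under those assumptions only. `generate_records` is a hypothetical helper, and Synthex's actual generation methods are not described in this guide.

```python
import random
from typing import Any, Dict, List


def generate_records(count: int, fraud_rate: float = 0.01, seed: int = 0) -> List[Dict[str, Any]]:
    """Sample synthetic transactions from simple, independent distributions."""
    rng = random.Random(seed)  # a fixed seed makes the output repeatable
    records = []
    for i in range(count):
        records.append({
            "customer_id": f"cust-{i:06d}",                           # unique by construction
            "transaction_amount": round(rng.uniform(0, 100_000), 2),  # within Min/Max
            "is_fraud": rng.random() < fraud_rate,                    # about 1% True
        })
    return records


records = generate_records(10_000)
fraud_share = sum(r["is_fraud"] for r in records) / len(records)
print(f"{len(records)} records, fraud share {fraud_share:.2%}")
```

Sampling each column independently is fast but ignores correlations between columns, which is one reason the slower methods listed above may be preferred for realistic data.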

Running Training Jobs

Creating an Experiment

Experiments organize related training runs:

Go to Model Training → Experiments
Click New Experiment
Fill in details:

| Field | Example |
|-------|---------|
| Name | fraud-detector-2024 |
| Description | Binary classifier for transaction fraud |
| Tags | fraud, classification, production |

Click Create

Starting a Training Job

Use the Training Wizard for guided setup:

Step 1: Select Model

Choose your model type and framework:

Framework: PyTorch, TensorFlow, scikit-learn, XGBoost
Model Type: Classification, Regression, NLP, etc.
Base Model: Pre-trained model (optional)

Step 2: Select Dataset

This is where Synthex integrates with Model Training:

Click Select Dataset
Browse available datasets:

- Raw Datasets — Original uploaded data

- Cleaned Datasets — Processed versions

- Synthetic/Augmented — Generated data

Select dataset version
Optionally select a Data Recipe for transformations

> Note: The UI queries Synthex to show available datasets. Your selection is recorded for reproducibility.

Step 3: Configure Training

Set hyperparameters and training options:

# Example configuration
learning_rate: 0.001
batch_size: 32
epochs: 100
optimizer: adam
early_stopping:
  patience: 10
  metric: val_loss
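
The `early_stopping` settings in the example configuration are commonly interpreted as "stop once the monitored metric has not improved for `patience` consecutive epochs". The Python sketch below shows that rule for a loss-style metric where lower is better. It is an assumption about the intended behavior, not Inwire's implementation, and `early_stopping_epoch` is a hypothetical name.

```python
from typing import List, Optional


def early_stopping_epoch(val_losses: List[float], patience: int) -> Optional[int]:
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best value: remember it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


losses = [1.0, 0.8, 0.9, 0.85, 0.95]
print(early_stopping_epoch(losses, patience=3))   # 5: three epochs without beating 0.8
print(early_stopping_epoch(losses, patience=10))  # None: patience is never exhausted
```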

Step 4: Select Infrastructure

Choose compute resources:

CPU Only — For small models and testing
Single GPU — Most common for development
Multi-GPU — For large models and production runs

Step 5: Review and Launch

Review all settings
Click Start Training
Monitor progress in real time

Monitoring Training Progress

The training dashboard shows:

Real-time Metrics — Loss, accuracy, custom metrics
Resource Usage — GPU utilization, memory
Logs — Training output and errors
Artifacts — Checkpoints, outputs

┌────────────────────────────────────────────────────────────┐
│  Training: fraud-detector-run-42                           │
├────────────────────────────────────────────────────────────┤
│  Status: Running (Epoch 45/100)                            │
│                                                            │
│  ┌──────────────────────────────┐  Metrics                 │
│  │ Loss                         │  ────────                │
│  │  ▁▂▃▄▅▄▃▂▁▁▁                 │  Train Loss: 0.023       │
│  │                              │  Val Loss: 0.031         │
│  └──────────────────────────────┘  Accuracy: 98.2%         │
│                                                            │
│  GPU: 78%   Memory: 12.4/16 GB   ETA: 23 min               │
└────────────────────────────────────────────────────────────┘

Comparing Experiments

To compare multiple runs:

Go to Model Training → Experiments
Select the experiments to compare using their checkboxes
Click Compare
View side-by-side:

- Configuration differences

- Metric comparisons

- Training curves
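
The "configuration differences" view can be thought of as a key-by-key diff of two run configurations. The Python sketch below illustrates the idea with plain dictionaries. `diff_configs` and the two example runs are hypothetical and do not reflect how Inwire stores or compares runs.

```python
from typing import Any, Dict, Tuple


def diff_configs(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each differing key to its (value in a, value in b) pair.

    A key missing from one side is reported with None on that side, so a
    missing key and an explicit None are treated as equal in this sketch.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


run_a = {"learning_rate": 0.001, "batch_size": 32, "optimizer": "adam"}
run_b = {"learning_rate": 0.01, "batch_size": 32, "optimizer": "adam", "epochs": 50}

for key, (left, right) in diff_configs(run_a, run_b).items():
    print(f"{key}: {left} -> {right}")
```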

Deploying Models with ModelOps

Registering a Model

After training completes:

Go to Model Training → Experiments → [Your Experiment]
Select the best run
Click Register Model
Fill in model details:

| Field | Example |
|-------|---------|
| Name | fraud-detector |
| Version | 1.0.0 |
| Description | Production fraud detection model |
| Tags | production, fraud, v1 |

Click Register

The model is now in the Model Registry.

Creating a Deployment

Go to ModelOps → Deployments
Click New Deployment
Configure:

| Setting | Description |
|---------|-------------|
| Model | Select from registry |
| Version | Model version to deploy |
| Environment | dev, staging, or prod |
| Replicas | Number of instances |
| Resources | CPU/Memory/GPU allocation |

Click Deploy

Deployment Strategies

Inwire supports several deployment strategies:

| Strategy | Description | Use Case |
|----------|-------------|----------|
| Rolling | Gradual replacement | Low-risk updates |
| Blue/Green | Instant switch | Zero-downtime releases |
| Canary | Partial traffic shift | Testing in production |
| Shadow | Mirror traffic | Validation without impact |
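
As an illustration of the canary strategy, the Python sketch below sends a fixed percentage of requests to a new version by hashing a request ID into one of 100 buckets. This is one common way to implement a partial traffic shift, shown under the assumption that each request carries a stable identifier. `route_request` is a hypothetical function, and ModelOps may split traffic differently.

```python
import hashlib


def route_request(request_id: str, canary_percent: int) -> str:
    """Return "canary" for roughly canary_percent% of IDs, otherwise "stable"."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


ids = [f"req-{i}" for i in range(10_000)]
canary_share = sum(route_request(i, 10) == "canary" for i in ids) / len(ids)
print(f"canary share: {canary_share:.1%}")
```

Hashing rather than random sampling keeps routing deterministic: the same ID always reaches the same version, which makes canary behavior easier to debug.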

Monitoring Deployments

The deployment dashboard shows:

Health Status — Liveness and readiness
Request Metrics — Latency, throughput, errors
Resource Usage — CPU, memory, GPU
Logs — Application and system logs

Building RAG Pipelines

Understanding RAG in Inwire

RAG (Retrieval-Augmented Generation) pipelines combine:

Document retrieval from a knowledge base
LLM generation using retrieved context

Creating a RAG Pipeline

Go to RAG → Pipelines
Click Create Pipeline
Configure stages:

Stage 1: Data Sources

Add documents to index:

File Upload — PDF, Markdown, Text
Web Crawler — URLs to scrape
Database — SQL queries
API — External data sources

Stage 2: Processing

Configure document processing:

Chunking — Split documents into segments
Embedding — Generate vector embeddings
Metadata — Extract and store metadata
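
To illustrate the chunking step, here is a minimal Python sketch of fixed-size, word-based chunking with overlap, a common baseline for RAG indexing. The function name `chunk_text` and the word-based splitting are assumptions made for this example; the chunking options Inwire actually offers are not specified in this guide.

```python
from typing import List


def chunk_text(text: str, chunk_size: int, overlap: int) -> List[str]:
    """Split text into chunks of chunk_size words; neighbors share overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


print(chunk_text("a b c d e f g", chunk_size=4, overlap=1))
# ['a b c d', 'd e f g']
```

The overlap repeats a little context at chunk boundaries so that a sentence split across two chunks can still be retrieved from either one.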

Stage 3: Retrieval

Set up search configuration:

Vector Search — Similarity-based retrieval
Keyword Search — Traditional text search
Hybrid — Combined approach
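
One widely used way to combine vector and keyword results is reciprocal rank fusion (RRF), which scores each document by summing 1 / (k + rank) over the result lists it appears in. The Python sketch below shows RRF as an example of a hybrid approach. Whether Inwire's Hybrid mode uses RRF or another method is not stated in this guide, and the document IDs are made up.

```python
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked ID lists into one list, best first, using RRF scores."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["d1", "d2", "d3"]   # hypothetical similarity-search results
keyword_hits = ["d2", "d4", "d1"]  # hypothetical keyword-search results
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# ['d2', 'd1', 'd4', 'd3']
```

RRF only needs ranks, not raw scores, so it avoids having to normalize similarity scores against keyword scores.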

Stage 4: Generation

Configure LLM generation:

Model — GPT-4, Claude, Llama, etc.
Prompt Template — System and user prompts
Parameters — Temperature, max tokens
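
A prompt template for the generation stage typically has placeholders for the retrieved context and the user's question. The Python sketch below shows the idea with the standard library's `string.Template`. The template wording, the `---` separator, and the `build_prompt` helper are illustrative assumptions, not Inwire's template syntax.

```python
from string import Template
from typing import List

# Hypothetical template: $context and $question are filled in per request.
PROMPT = Template(
    "Answer the question using only the context below.\n\n"
    "Context:\n$context\n\n"
    "Question: $question"
)


def build_prompt(question: str, chunks: List[str]) -> str:
    """Join retrieved chunks and substitute them into the template."""
    context = "\n---\n".join(chunks)
    return PROMPT.substitute(context=context, question=question)


print(build_prompt(
    "Which deployment strategy mirrors traffic?",
    ["Shadow: mirror traffic.", "Canary: partial traffic shift."],
))
```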

Testing RAG Pipelines

Go to your pipeline
Click Test
Enter a query
Review:

- Retrieved documents

- Generated response

- Confidence scores

RAG and Synthex Integration

Synthex can generate test data for RAG pipelines:

Golden Evaluation Sets — Q&A pairs for measuring RAG quality
Stress Test Data — Large query volumes
Edge Cases — Difficult or adversarial queries

See the Synthex User Guide for details on generating RAG evaluation data.

Working with Real-time Data (Stream)

Understanding Stream

The Stream service handles real-time data:

Ingestion — Receive data from various sources
Processing — Transform and enrich in real-time
Routing — Send to destinations (storage, ML models)

Creating a Stream Pipeline

Go to Stream → Pipelines
Click Create Pipeline
Configure:

- Source — Kafka, webhook, database CDC

- Transformations — Filter, map, aggregate

- Sink — Storage, model inference, alerts
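
The filter, map, and aggregate transformations can be pictured as a chain applied to each incoming event. The Python sketch below runs such a chain over an in-memory list using generators. The event fields and the `run_pipeline` function are hypothetical; a real Stream pipeline reads from a source such as Kafka and is configured in the UI rather than in code.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def run_pipeline(events: Iterable[Dict[str, Any]], min_amount: float) -> Dict[str, float]:
    """Filter small events, map to (customer, amount), and sum per customer."""
    filtered = (e for e in events if e["amount"] >= min_amount)    # filter
    mapped = ((e["customer_id"], e["amount"]) for e in filtered)   # map
    totals: Dict[str, float] = defaultdict(float)
    for customer_id, amount in mapped:                             # aggregate
        totals[customer_id] += amount
    return dict(totals)


events = [
    {"customer_id": "c-1", "amount": 120.0},
    {"customer_id": "c-2", "amount": 5.0},
    {"customer_id": "c-1", "amount": 80.0},
]
print(run_pipeline(events, min_amount=10.0))
# {'c-1': 200.0}
```

Generators process one event at a time, which mirrors how a streaming system handles data without loading the whole stream into memory.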

Monitoring Streams

View real-time metrics:

Throughput — Messages per second
Latency — Processing time
Errors — Failed messages and retries

Prompt Engineering with PromptScope

Testing Prompts

Go to PromptScope → Playground
Enter your prompt
Select model and parameters
Run and iterate

Managing Prompt Templates

Go to PromptScope → Templates
Create reusable prompt templates
Version and compare templates
Deploy templates to production

A/B Testing Prompts

Create variant prompts
Set up an A/B test
Route traffic to variants
Analyze results
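
For the analysis step, a simple approach is to compare the success rates of two variants with a two-proportion z-test. The Python sketch below assumes each response can be labeled a success or failure (for example, by a thumbs-up signal); that labeling and the function name are assumptions for illustration, and PromptScope may report results differently.

```python
import math
from typing import Tuple


def two_proportion_z_test(success_a: int, total_a: int,
                          success_b: int, total_b: int) -> Tuple[float, float]:
    """Return (z statistic, two-sided p-value) for the difference in success rates."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both variants need at least one observation")
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 0.0, 1.0  # all successes or all failures: no detectable difference
    z = (rate_a - rate_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p_value


z, p = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between variants is unlikely to be noise, but the test assumes independent samples and reasonably large counts per variant.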

Reproducibility and Lineage

Understanding Data Lineage

Inwire tracks the complete lineage of your ML artifacts:

Raw Dataset → Cleaned Dataset → Synthetic Dataset → Training Run → Model → Deployment
     ↓              ↓                  ↓                 ↓          ↓          ↓
[Synthex]      [Recipe]         [Synthex Config]    [Experiment]  [Registry] [ModelOps]

Viewing Lineage

Go to any artifact (dataset, model, deployment)
Click Lineage or History
View the complete chain of operations

Reproducing Results

To reproduce a training run:

Go to the experiment run
View Configuration — All parameters used
View Data — Exact dataset version and recipe
Click Reproduce to create a new run with identical settings
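
One way to check that two runs really used identical settings is to hash the configuration together with the dataset version and recipe. The Python sketch below computes such a fingerprint from canonical JSON. It illustrates the reproducibility idea only; `run_fingerprint`, the field names, and the example values are hypothetical, and this guide does not describe how Inwire records runs internally.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def run_fingerprint(config: Dict[str, Any], dataset_version: str,
                    recipe: Optional[str] = None) -> str:
    """Return a SHA-256 hex digest that changes whenever any input changes."""
    payload = {"config": config, "dataset_version": dataset_version, "recipe": recipe}
    # sort_keys makes the JSON independent of dictionary insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = run_fingerprint({"learning_rate": 0.001, "batch_size": 32}, "v3", "clean-v1")
rerun = run_fingerprint({"batch_size": 32, "learning_rate": 0.001}, "v3", "clean-v1")
print(original == rerun)  # True: same settings, different key order
```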

Best Practices

Data Management

Version everything — Datasets, recipes, and configs
Use meaningful names — Include date, purpose, version
Document assumptions — Note data quality issues
Review data quality — Before and after transformations

Training Workflows

Start small — Test with subset before full training
Track everything — Log all experiments, even failures
Compare systematically — Use experiment comparison tools
Automate validation — Set up automated quality checks

Deployment Safety

Use staging — Always test in staging first
Monitor closely — Especially after new deployments
Have rollback plans — Know how to revert quickly
Set up alerts — For latency, errors, drift

Common Workflows Summary

| Workflow | Services Used | Key Steps |
|----------|---------------|-----------|
| Train a classifier | Synthex → Model Training | Import data → Create profile → Train → Evaluate |
| Deploy a model | Model Training → ModelOps | Register model → Create deployment → Monitor |
| Build RAG pipeline | RAG + Synthex | Add documents → Configure retrieval → Test with synthetic queries |
| Generate test data | Synthex | Create profile → Generate synthetic → Export |

Next Steps

For deeper dives into specific services:

Synthex User Guide — Comprehensive synthetic data guide
Backend Services Overview — All service documentation

For help with specific tasks, use the in-app help or return to the User Guide.