Using Inwire
This guide covers day-to-day workflows in Inwire, from preparing data and running training jobs to deploying models and setting up RAG pipelines. By the end, you'll understand how the major modules work together in typical ML workflows.
Overview of Common Workflows
Inwire supports several interconnected workflows:
┌───────────────────────────────────────────────────┐
│           Typical ML Workflow in Inwire           │
└───────────────────────────────────────────────────┘
┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐
│  Data   │ → │  Model  │ → │ Deploy  │ → │ Monitor │
│  Prep   │   │Training │   │         │   │         │
└─────────┘   └─────────┘   └─────────┘   └─────────┘
     │             │             │             │
     ▼             ▼             ▼             ▼
┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐
│ Synthex │   │  Model  │   │ModelOps │   │ Observe │
│         │   │Training │   │         │   │         │
└─────────┘   └─────────┘   └─────────┘   └─────────┘
Working with Data in Synthex
Synthex is the "data twin" of Model Training — it manages all data-related operations including:
- Dataset ingestion and profiling
- Data cleaning and transformation
- Synthetic data generation
- Data quality evaluation
Importing a Dataset
- Navigate to Synthex → Datasets
- Click Import Dataset
- Choose your source:
  - Upload File — CSV, Parquet, JSONL
  - Cloud Storage — S3, GCS, Azure Blob
  - Database — Direct query (if configured)
- Configure import options:
| Option | Description |
|--------|-------------|
| Name | Descriptive dataset name |
| Description | What this data represents |
| Source Type | Tabular, Text, Time Series, etc. |
| Tags | For organization and filtering |
- Click Import
The system will automatically profile your data and detect the schema.
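This guide does not describe how the profiler detects a schema. As a rough illustration of what schema detection involves, the sketch below infers a type label for each column of a small CSV sample using only Python's standard library. The function names (`infer_type`, `infer_schema`) and the type labels are assumptions made for this example, not Inwire identifiers.

```python
import csv
import io

def infer_type(values):
    """Return the narrowest type label that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "String"
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "Boolean"
    # Try the stricter numeric type first, then fall back to the looser one.
    for label, cast in (("Integer", int), ("Float", float)):
        try:
            for v in non_empty:
                cast(v)
            return label
        except ValueError:
            continue
    return "String"

def infer_schema(csv_text):
    """Map each column name in a CSV document to an inferred type label."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    # Missing cells come back as None from DictReader; treat them as empty.
    return {name: infer_type([(row[name] or "") for row in rows]) for name in rows[0]}

sample = "customer_id,transaction_amount,is_fraud\nC001,19.99,false\nC002,250,true\n"
schema = infer_schema(sample)
```

For the sample above, `customer_id` is labeled `String`, `transaction_amount` is labeled `Float`, and `is_fraud` is labeled `Boolean`. It is worth reviewing the detected schema after import, because any inference of this kind can be misled by a small or unrepresentative sample.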
Creating a Data Profile
Data profiles define the schema and characteristics of your dataset:
- Go to Synthex → Data Profiles
- Click Create Profile
- Define columns:
```
Column: customer_id
  Type: String
  Constraints: Unique, Not Null

Column: transaction_amount
  Type: Float
  Constraints: Min: 0, Max: 100000

Column: is_fraud
  Type: Boolean
  Distribution: 1% True, 99% False
```
- Save the profile
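To make the constraints in the example profile concrete, here is a minimal sketch that expresses the same three columns as plain Python data and checks records against them. The `ColumnSpec` class, the `PROFILE` list, and the `validate` function are hypothetical names for this illustration and do not reflect how Synthex stores or enforces profiles. The `Distribution` hint for `is_fraud` is not checked here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ColumnSpec:
    name: str
    dtype: type
    not_null: bool = False
    unique: bool = False
    min_value: Optional[float] = None
    max_value: Optional[float] = None

# The example profile from this section, written as data.
PROFILE = [
    ColumnSpec("customer_id", str, not_null=True, unique=True),
    ColumnSpec("transaction_amount", float, min_value=0, max_value=100000),
    ColumnSpec("is_fraud", bool),
]

def validate(records, profile):
    """Return a list of human-readable constraint violations."""
    errors = []
    for spec in profile:
        seen = set()
        for i, record in enumerate(records):
            value = record.get(spec.name)
            if value is None:
                if spec.not_null:
                    errors.append(f"row {i}: {spec.name} is null")
                continue
            if not isinstance(value, spec.dtype):
                errors.append(f"row {i}: {spec.name} is not {spec.dtype.__name__}")
                continue
            if spec.min_value is not None and value < spec.min_value:
                errors.append(f"row {i}: {spec.name} is below {spec.min_value}")
            if spec.max_value is not None and value > spec.max_value:
                errors.append(f"row {i}: {spec.name} is above {spec.max_value}")
            if spec.unique:
                if value in seen:
                    errors.append(f"row {i}: {spec.name} duplicates an earlier row")
                seen.add(value)
    return errors
```

Calling `validate(records, PROFILE)` on a list of dictionaries returns an empty list when every record satisfies the profile.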
Generating Synthetic Data
For a detailed walkthrough of synthetic data generation, see the Synthex User Guide.
Quick generation workflow:
- Go to Synthex → Generate Data
- Select a data profile or source dataset
- Choose generation method:
  - Statistical — Fast, preserves distributions
  - GAN-based — Higher quality, slower
  - LLM-based — For text and complex patterns
- Configure options (record count, privacy level)
- Click Generate
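As a rough picture of what a statistical generator does, the sketch below draws records that satisfy the example profile from Creating a Data Profile: unique customer IDs, amounts between 0 and 100000, and about 1% fraud. It is not Synthex's generator. The uniform amount distribution, the `CUST-` ID format, and the name `generate_records` are assumptions chosen for illustration; a real statistical method would fit distributions to the source data instead.

```python
import random

def generate_records(count, fraud_rate=0.01, max_amount=100000.0, seed=0):
    """Generate `count` synthetic transactions with a fixed random seed."""
    rng = random.Random(seed)  # seeded so the same call yields the same data
    records = []
    for i in range(count):
        records.append({
            "customer_id": f"CUST-{i:06d}",  # sequential, therefore unique
            "transaction_amount": round(rng.uniform(0, max_amount), 2),
            "is_fraud": rng.random() < fraud_rate,
        })
    return records

records = generate_records(1000, seed=42)
```

Fixing the seed is what makes a generation run repeatable, which matters when the generated dataset later feeds a training run you want to reproduce.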
Running Training Jobs
Creating an Experiment
Experiments organize related training runs:
- Go to Model Training → Experiments
- Click New Experiment
- Fill in details:
| Field | Example |
|-------|---------|
| Name | fraud-detector-2024 |
| Description | Binary classifier for transaction fraud |
| Tags | fraud, classification, production |
- Click Create
Starting a Training Job
Use the Training Wizard for guided setup:
Step 1: Select Model
Choose your model type and framework:
- Framework: PyTorch, TensorFlow, scikit-learn, XGBoost
- Model Type: Classification, Regression, NLP, etc.
- Base Model: Pre-trained model (optional)
Step 2: Select Dataset
This is where Synthex integrates with Model Training:
- Click Select Dataset
- Browse available datasets:
  - Raw Datasets — Original uploaded data
  - Cleaned Datasets — Processed versions
  - Synthetic/Augmented — Generated data
- Select dataset version
- Optionally select a Data Recipe for transformations
> Note: The UI queries Synthex to show available datasets. Your selection is recorded for reproducibility.
Step 3: Configure Training
Set hyperparameters and training options:
# Example configuration
learning_rate: 0.001
batch_size: 32
epochs: 100
optimizer: adam
early_stopping:
  patience: 10
  metric: val_loss
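The `early_stopping` settings in the example are easiest to understand through the rule they usually describe: stop when the monitored metric has not improved for `patience` consecutive epochs. The sketch below shows that common rule for a loss that should decrease. It is an assumption about the semantics, not a description of Inwire's implementation, and `should_stop` is a hypothetical helper.

```python
def should_stop(val_losses, patience=10):
    """Return True once val_loss has gone `patience` epochs without a new best.

    `val_losses` holds one validation loss per completed epoch, oldest first.
    Only a strictly lower loss counts as an improvement.
    """
    if not val_losses:
        return False
    # Index of the first occurrence of the lowest loss seen so far.
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    epochs_since_best = len(val_losses) - 1 - best_epoch
    return epochs_since_best >= patience
```

With `patience: 10` and `epochs: 100`, a run under this rule ends early if ten epochs pass without a lower `val_loss`, and otherwise runs all 100 epochs.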
Step 4: Select Infrastructure
Choose compute resources:
- CPU Only — For small models and testing
- Single GPU — Most common for development
- Multi-GPU — For large models and production runs
Step 5: Review and Launch
- Review all settings
- Click Start Training
- Monitor progress in real time
Monitoring Training Progress
The training dashboard shows:
- Real-time Metrics — Loss, accuracy, custom metrics
- Resource Usage — GPU utilization, memory
- Logs — Training output and errors
- Artifacts — Checkpoints, outputs
┌────────────────────────────────────────────────────────────┐
│ Training: fraud-detector-run-42                            │
├────────────────────────────────────────────────────────────┤
│ Status: Running (Epoch 45/100)                             │
│                                                            │
│ ┌──────────────────────────────┐  Metrics                  │
│ │ Loss                         │  ────────                 │
│ │ ▁▂▃▄▅▄▃▂▁▁▁                  │  Train Loss: 0.023        │
│ │                              │  Val Loss: 0.031          │
│ └──────────────────────────────┘  Accuracy: 98.2%          │
│                                                            │
│ GPU: 78%   Memory: 12.4/16 GB   ETA: 23 min                │
└────────────────────────────────────────────────────────────┘
Comparing Experiments
To compare multiple runs:
- Go to Model Training → Experiments
- Select the experiments to compare using their checkboxes
- Click Compare
- View side by side:
  - Configuration differences
  - Metric comparisons
  - Training curves
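If you export run configurations and want to see the configuration differences outside the UI, the idea reduces to a dictionary diff. The sketch below is a minimal version under the assumption that each run's configuration is available as a flat Python dictionary; `diff_configs` is a hypothetical helper, not part of Inwire.

```python
def diff_configs(run_a, run_b):
    """Return {key: (value_in_a, value_in_b)} for every key whose value differs.

    A key missing from one run is reported with None on that side.
    """
    keys = sorted(set(run_a) | set(run_b))
    return {
        key: (run_a.get(key), run_b.get(key))
        for key in keys
        if run_a.get(key) != run_b.get(key)
    }

run_41 = {"learning_rate": 0.001, "batch_size": 32, "optimizer": "adam"}
run_42 = {"learning_rate": 0.01, "batch_size": 32, "optimizer": "adam", "epochs": 50}
changes = diff_configs(run_41, run_42)
```

Here `changes` contains only `learning_rate` and `epochs`, the two settings that differ between the runs.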
Deploying Models with ModelOps
Registering a Model
After training completes:
- Go to Model Training → Experiments → [Your Experiment]
- Select the best run
- Click Register Model
- Fill in model details:
| Field | Example |
|-------|---------|
| Name | fraud-detector |
| Version | 1.0.0 |
| Description | Production fraud detection model |
| Tags | production, fraud, v1 |
- Click Register
The model is now in the Model Registry.
Creating a Deployment
- Go to ModelOps → Deployments
- Click New Deployment
- Configure:
| Setting | Description |
|---------|-------------|
| Model | Select from registry |
| Version | Model version to deploy |
| Environment | dev, staging, or prod |
| Replicas | Number of instances |
| Resources | CPU/Memory/GPU allocation |
- Click Deploy
Deployment Strategies
Inwire supports several deployment strategies:
| Strategy | Description | Use Case |
|---|---|---|
| Rolling | Gradual replacement | Low-risk updates |
| Blue/Green | Instant switch | Zero-downtime releases |
| Canary | Partial traffic shift | Testing in production |
| Shadow | Mirror traffic | Validation without impact |
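To illustrate the partial traffic shift behind a canary release, the sketch below assigns each request to the stable or canary version by hashing a request ID into one of 100 buckets. This is one common way to split traffic and is shown only as an assumption; the guide does not say how ModelOps routes requests, and `route` is a hypothetical function.

```python
import hashlib

def route(request_id, canary_percent):
    """Return "canary" for roughly canary_percent% of request IDs, else "stable"."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"
```

Because the bucket depends only on the ID, the same request ID always lands on the same version, and raising `canary_percent` from 10 to 50 keeps every ID that was already on the canary there.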
Monitoring Deployments
The deployment dashboard shows:
- Health Status — Liveness and readiness
- Request Metrics — Latency, throughput, errors
- Resource Usage — CPU, memory, GPU
- Logs — Application and system logs
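Request metrics such as latency and errors are usually summarized as percentiles and rates. The sketch below shows one way to compute them from raw request records, using the nearest-rank percentile definition. The record shape (latency in milliseconds plus an HTTP status code) and the function names are assumptions for this example; the dashboard may define its metrics differently.

```python
def percentile(values, p):
    """Nearest-rank percentile for an integer p in 0..100."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]

def summarize(requests):
    """Summarize a non-empty list of (latency_ms, http_status) tuples."""
    latencies = [latency for latency, _ in requests]
    errors = sum(1 for _, status in requests if status >= 500)
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "error_rate": errors / len(requests),
    }
```

Tracking a high percentile such as p95 alongside the median helps surface slow outliers that an average would hide.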
Building RAG Pipelines
Understanding RAG in Inwire
RAG (Retrieval-Augmented Generation) pipelines combine:
- Document retrieval from a knowledge base
- LLM generation using retrieved context
Creating a RAG Pipeline
- Go to RAG → Pipelines
- Click Create Pipeline
- Configure stages:
Stage 1: Data Sources
Add documents to index:
- File Upload — PDF, Markdown, Text
- Web Crawler — URLs to scrape
- Database — SQL queries
- API — External data sources
Stage 2: Processing
Configure document processing:
- Chunking — Split documents into segments
- Embedding — Generate vector embeddings
- Metadata — Extract and store metadata
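Chunking is the step that most affects what retrieval can find, so it helps to see a simple strategy written out. The sketch below splits text into fixed-size character windows that overlap, so a sentence cut at one boundary still appears whole in a neighboring chunk. The function name and the default sizes are assumptions for illustration; Inwire's chunking options are not specified in this guide.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

For example, `chunk_text("abcdefghij", chunk_size=4, overlap=1)` returns `["abcd", "defg", "ghij"]`. Production pipelines often chunk by tokens or sentences instead of characters, but the overlap idea is the same.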
Stage 3: Retrieval
Set up search configuration:
- Vector Search — Similarity-based retrieval
- Keyword Search — Traditional text search
- Hybrid — Combined approach
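One simple way to combine the two search modes is a weighted sum of a vector similarity score and a keyword overlap score. The sketch below assumes document embeddings are already computed and stored with each document; the `alpha` weight, the scoring functions, and the document layout are all assumptions for illustration and not Inwire's retrieval algorithm.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_score(query, text):
    """Fraction of distinct query terms that appear in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0

def hybrid_search(query, query_vec, docs, alpha=0.5, top_k=2):
    """Rank docs by alpha * vector score + (1 - alpha) * keyword score."""
    scored = []
    for doc in docs:
        score = (alpha * cosine(query_vec, doc["vector"])
                 + (1 - alpha) * keyword_score(query, doc["text"]))
        scored.append((score, doc["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:top_k]]

docs = [
    {"id": "a", "text": "refund policy for orders", "vector": [1.0, 0.0]},
    {"id": "b", "text": "shipping times", "vector": [0.0, 1.0]},
    {"id": "c", "text": "refund timeline", "vector": [0.6, 0.8]},
]
top = hybrid_search("refund policy", [1.0, 0.0], docs)
```

Setting `alpha` to 1.0 gives pure vector search and 0.0 gives pure keyword search, which makes it easy to compare the three modes on the same queries.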
Stage 4: Generation
Configure LLM generation:
- Model — GPT-4, Claude, Llama, etc.
- Prompt Template — System and user prompts
- Parameters — Temperature, max tokens
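A prompt template for the generation stage typically has slots for the retrieved context and the user's question. The sketch below builds such a prompt with Python's `string.Template`. The template wording, the numbering of chunks, and the `build_prompt` name are assumptions for illustration; they are not a template shipped with Inwire.

```python
from string import Template

PROMPT = Template(
    "Answer the question using only the context below.\n\n"
    "Context:\n$context\n\n"
    "Question: $question\n"
    "Answer:"
)

def build_prompt(question, chunks):
    """Fill the template with numbered context chunks and the user's question."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return PROMPT.substitute(context=context, question=question)

prompt = build_prompt(
    "How long do refunds take?",
    ["Refunds are issued within 5 business days.", "Shipping takes 2 days."],
)
```

Numbering the chunks makes it possible to ask the model to cite which chunk supports its answer, which is useful when you review retrieved documents during testing.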
Testing RAG Pipelines
- Go to your pipeline
- Click Test
- Enter a query
- Review:
  - Retrieved documents
  - Generated response
  - Confidence scores
RAG and Synthex Integration
Synthex can generate test data for RAG pipelines:
- Golden Evaluation Sets — Q&A pairs for measuring RAG quality
- Stress Test Data — Large query volumes
- Edge Cases — Difficult or adversarial queries
See the Synthex User Guide for details on generating RAG evaluation data.
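One common way to use a golden evaluation set is to measure how often the retriever returns the expected document in its top results. The sketch below computes that recall-at-k figure. The item fields (`question`, `expected_doc`) and the `retrieve` callable are assumptions for this example and do not describe the format Synthex produces.

```python
def recall_at_k(golden_set, retrieve, k=3):
    """Share of golden questions whose expected document is in the top k results.

    `retrieve` is any callable that maps a question to a ranked list of document IDs.
    """
    if not golden_set:
        return 0.0
    hits = sum(
        1 for item in golden_set
        if item["expected_doc"] in retrieve(item["question"])[:k]
    )
    return hits / len(golden_set)

golden = [
    {"question": "How long do refunds take?", "expected_doc": "refunds"},
    {"question": "Where is my order?", "expected_doc": "tracking"},
]
# A stand-in retriever with fixed rankings, used only to exercise the metric.
fake_index = {
    "How long do refunds take?": ["refunds", "shipping"],
    "Where is my order?": ["shipping", "returns", "faq", "tracking"],
}
score = recall_at_k(golden, lambda q: fake_index[q], k=3)
```

In this toy data the second question's expected document is ranked fourth, so it counts as a miss at `k=3` and the score is 0.5.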
Working with Real-time Data (Stream)
Understanding Stream
The Stream service handles real-time data:
- Ingestion — Receive data from various sources
- Processing — Transform and enrich in real time
- Routing — Send to destinations (storage, ML models)
Creating a Stream Pipeline
- Go to Stream → Pipelines
- Click Create Pipeline
- Configure:
  - Source — Kafka, webhook, database CDC
  - Transformations — Filter, map, aggregate
  - Sink — Storage, model inference, alerts
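The filter, map, and aggregate transformations can be pictured as small functions chained over a sequence of events. The sketch below uses Python generators over an in-memory list in place of a real source such as Kafka. The event fields and function names are assumptions for illustration and are not Stream's configuration format.

```python
from collections import defaultdict

def filter_valid(events):
    """Filter: drop events with a missing or negative amount."""
    for event in events:
        if event.get("amount") is not None and event["amount"] >= 0:
            yield event

def to_cents(events):
    """Map: add an integer amount_cents field to each event."""
    for event in events:
        yield {**event, "amount_cents": round(event["amount"] * 100)}

def total_by_customer(events):
    """Aggregate: sum amount_cents per customer."""
    totals = defaultdict(int)
    for event in events:
        totals[event["customer_id"]] += event["amount_cents"]
    return dict(totals)

source = [
    {"customer_id": "C1", "amount": 1.50},
    {"customer_id": "C2", "amount": -3.0},
    {"customer_id": "C1", "amount": 2.25},
    {"customer_id": "C3", "amount": None},
]
totals = total_by_customer(to_cents(filter_valid(source)))
```

Because each stage is a generator, events flow through one at a time without the whole stream being held in memory, which mirrors how a streaming pipeline processes data as it arrives.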
Monitoring Streams
View real-time metrics:
- Throughput — Messages per second
- Latency — Processing time
- Errors — Failed messages and retries
Prompt Engineering with PromptScope
Testing Prompts
- Go to PromptScope → Playground
- Enter your prompt
- Select model and parameters
- Run and iterate
Managing Prompt Templates
- Go to PromptScope → Templates
- Create reusable prompt templates
- Version and compare templates
- Deploy templates to production
A/B Testing Prompts
- Create variant prompts
- Set up an A/B test
- Route traffic to variants
- Analyze results
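When you analyze results, a difference in success rates between two variants is only meaningful if it is unlikely to be noise. The sketch below applies a standard two-proportion z-test to success counts from two prompt variants. How PromptScope reports results is not covered in this guide, so the function and its inputs are assumptions for illustration.

```python
import math

def two_proportion_p_value(success_a, total_a, success_b, total_b):
    """Two-sided p-value for the difference between two success rates."""
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 1.0  # both variants always succeed or always fail: no evidence of a difference
    z = (rate_a - rate_b) / std_err
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))
```

A small p-value (commonly below 0.05) suggests the variants really differ. The normal approximation behind this test is unreliable for small samples, so collect enough traffic per variant before drawing conclusions.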
Reproducibility and Lineage
Understanding Data Lineage
Inwire tracks the complete lineage of your ML artifacts:
Raw Dataset → Cleaned Dataset → Synthetic Dataset → Training Run →   Model    → Deployment
     ↓               ↓                  ↓                ↓             ↓            ↓
 [Synthex]       [Recipe]       [Synthex Config]    [Experiment]   [Registry]   [ModelOps]
Viewing Lineage
- Go to any artifact (dataset, model, deployment)
- Click Lineage or History
- View the complete chain of operations
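Conceptually, a lineage view follows parent links from an artifact back to its raw data. The sketch below models the chain from this section as a dictionary of parent pointers and walks it. The artifact names and the `PARENTS` mapping are hypothetical; real lineage records would also carry versions and configuration.

```python
# Each artifact points at the artifact it was derived from (None marks the origin).
PARENTS = {
    "deployment": "model",
    "model": "training_run",
    "training_run": "synthetic_dataset",
    "synthetic_dataset": "cleaned_dataset",
    "cleaned_dataset": "raw_dataset",
    "raw_dataset": None,
}

def lineage(artifact, parents=PARENTS):
    """Return the chain from the original source up to and including `artifact`."""
    chain = []
    while artifact is not None:
        chain.append(artifact)
        artifact = parents[artifact]
    return list(reversed(chain))
```

For example, `lineage("model")` returns the raw dataset, cleaned dataset, synthetic dataset, training run, and model, in that order.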
Reproducing Results
To reproduce a training run:
- Go to the experiment run
- View Configuration — All parameters used
- View Data — Exact dataset version and recipe
- Click Reproduce to create a new run with identical settings
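To check outside the UI that two runs really used identical settings, you can compare a fingerprint of everything that defines a run. The sketch below hashes a canonical JSON form of the configuration, dataset version, and recipe. The field names and the `run_fingerprint` helper are assumptions for illustration; Inwire's own reproducibility records are not described in this guide.

```python
import hashlib
import json

def run_fingerprint(config, dataset_version, recipe=None):
    """Return a SHA-256 hex digest that identifies a run's inputs."""
    payload = {"config": config, "dataset_version": dataset_version, "recipe": recipe}
    # sort_keys makes the digest independent of dictionary key order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

original = run_fingerprint({"learning_rate": 0.001, "batch_size": 32}, "v3")
reproduced = run_fingerprint({"batch_size": 32, "learning_rate": 0.001}, "v3")
```

Here `original` and `reproduced` are equal even though the keys were written in a different order. Matching fingerprints confirm identical inputs, but results can still vary slightly between runs unless random seeds and hardware behavior are also controlled.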
Best Practices
Data Management
- Version everything — Datasets, recipes, and configs
- Use meaningful names — Include date, purpose, version
- Document assumptions — Note data quality issues
- Review data quality — Before and after transformations
Training Workflows
- Start small — Test with subset before full training
- Track everything — Log all experiments, even failures
- Compare systematically — Use experiment comparison tools
- Automate validation — Set up automated quality checks
Deployment Safety
- Use staging — Always test in staging first
- Monitor closely — Especially after new deployments
- Have rollback plans — Know how to revert quickly
- Set up alerts — For latency, errors, drift
Common Workflows Summary
| Workflow | Services Used | Key Steps |
|---|---|---|
| Train a classifier | Synthex → Model Training | Import data → Create profile → Train → Evaluate |
| Deploy a model | Model Training → ModelOps | Register model → Create deployment → Monitor |
| Build RAG pipeline | RAG + Synthex | Add documents → Configure retrieval → Test with synthetic queries |
| Generate test data | Synthex | Create profile → Generate synthetic → Export |
Next Steps
For deeper dives into specific services:
- Synthex User Guide — Comprehensive synthetic data guide
- Backend Services Overview — All service documentation
For help with specific tasks, use the in-app help or return to the User Guide.