# LSTM Trading Bot - Agent Development Guide

## Project Overview

This document provides guidelines for automated agents (Cursor, Copilot, Claude, etc.) and human contributors working on the LSTM-based multi-timeframe trading bot project.

## 1. Required Checks

### Code Quality
- **Linting**: Run `python -m py_compile` on all Python files
- **Type Checking**: Ensure proper type hints throughout
- **Import Validation**: Verify all imports work correctly
- **Configuration Validation**: Test YAML configuration loading

### Testing
- **Unit Tests**: Run tests for individual modules
- **Integration Tests**: Test data pipeline and model training
- **Backtesting Validation**: Verify strategy performance
- **Import Tests**: Ensure all modules can be imported

### Dependency Management
- **Requirements Check**: Verify `requirements.txt` includes all dependencies
- **Version Compatibility**: Ensure PyTorch, pandas, etc. versions are compatible
- **Optional Dependencies**: Test functionality without optional packages

## 2. Project Structure

```
├── config/                    # Configuration files
│   └── config.yaml           # Main configuration
├── data/                     # Data storage (gitignored)
│   ├── raw/                 # Raw OHLCV data
│   └── cache/               # Processed features
├── notebooks/                # Jupyter notebooks
│   ├── 01_data_eda.ipynb    # Data exploration
│   ├── 02_train_optuna.ipynb # Model training
│   ├── 03_backtest.ipynb     # Backtesting
│   └── 04_live_trading.ipynb # Live trading
├── src/                      # Source code
│   ├── data/                # Data loading
│   │   └── loaders.py
│   ├── features/            # Feature engineering
│   │   ├── build_dataset.py
│   │   └── indicators.py
│   ├── models/              # LSTM architectures
│   │   └── lstm_fusion.py
│   ├── training/            # Training framework
│   │   ├── train.py
│   │   └── metrics.py
│   ├── backtest/            # Backtesting engine
│   │   ├── engine.py
│   │   └── strategy.py
│   ├── live/                # Live trading
│   │   ├── broker_base.py
│   │   ├── broker_alpaca.py
│   │   ├── streamer.py
│   │   └── executor.py
│   └── utils/               # Utilities
│       ├── config.py
│       ├── logging.py
│       ├── seeds.py
│       └── times.py
├── tests/                   # Test files
├── docs/                    # Documentation
│   └── agents/              # Agent-specific docs
└── requirements.txt          # Dependencies
```

## 3. Coding Conventions

### Python Standards
- **Language**: Python 3.10+ with type hints
- **Style**: Follow PEP 8 with 4-space indentation
- **Imports**: Standard library first, then third-party, then local
- **Documentation**: Use Google-style docstrings
- **Error Handling**: Comprehensive try-catch with logging

### Project-Specific Conventions
- **Configuration**: All parameters in `config/config.yaml`
- **Logging**: Use structured logging with `src.utils.logging`
- **Error Handling**: Graceful degradation with informative messages
- **Testing**: Unit tests for all public functions

### File Naming
- **Modules**: `snake_case.py`
- **Classes**: `PascalCase`
- **Functions**: `snake_case`
- **Constants**: `UPPER_SNAKE_CASE`

## 4. Development Workflow

### Agent Development Process
1. **Read Memory**: Check `docs/agents/ledger.json` for existing work
2. **Understand Intent**: Clarify requirements before implementation
3. **Plan Changes**: Create minimal, testable implementation plan
4. **Implement**: Write clean, documented code
5. **Test**: Validate functionality and edge cases
6. **Document**: Update relevant documentation
7. **Log**: Add entry to `docs/agents/ledger.json`

### Human Contribution Process
1. **Issue Creation**: Create GitHub issue for proposed changes
2. **Branch Creation**: Create feature branch from `main`
3. **Implementation**: Follow coding conventions
4. **Testing**: Run all relevant tests
5. **PR Creation**: Submit pull request with description
6. **Review**: Address review comments
7. **Merge**: Merge after approval

## 5. Module-Specific Guidelines

### Data Pipeline (`src/data/`, `src/features/`)
- **Data Sources**: Support CSV, Alpaca, Binance, OANDA
- **Feature Engineering**: Technical indicators, lagged features, calendar features
- **Data Quality**: Handle missing data, outliers, and anomalies
- **Performance**: Efficient processing for large datasets

### Model Architecture (`src/models/`)
- **LSTM Variants**: Single LSTM, multi-LSTM fusion
- **Attention Mechanisms**: Multi-head attention implementation
- **Regularization**: Dropout, layer normalization
- **Output Modes**: Regression and classification

### Training Framework (`src/training/`)
- **Optimization**: Optuna hyperparameter tuning
- **Validation**: Walk-forward cross-validation
- **Metrics**: Trading-specific performance measures
- **Checkpointing**: Model saving and loading

### Backtesting (`src/backtest/`)
- **Framework**: Backtrader integration
- **Risk Management**: Position sizing, stop losses
- **Performance**: Comprehensive metrics and reporting
- **Realism**: Slippage, commission, latency modeling

### Live Trading (`src/live/`)
- **Broker Abstraction**: Support multiple exchanges
- **Streaming**: Real-time market data
- **Execution**: Order management and risk controls
- **Monitoring**: Performance tracking and alerting

## 6. Configuration Management

### Main Configuration (`config/config.yaml`)
- **Data Settings**: Symbols, timeframes, sources
- **Model Parameters**: Architecture, hyperparameters
- **Training Settings**: Batch size, epochs, optimization
- **Risk Parameters**: Position sizing, circuit breakers
- **Live Trading**: Broker settings, execution parameters

### Environment Variables
```bash
# Broker API credentials
API_KEY_ALPACA=your_key
API_SECRET_ALPACA=your_secret
API_KEY_BINANCE=your_key
API_KEY_OANDA=your_key

# Optional: Alert webhooks
SLACK_WEBHOOK=your_webhook_url
DISCORD_WEBHOOK=your_webhook_url
```

## 7. Testing Strategy

### Unit Tests
- **Data Loaders**: Test data loading from all sources
- **Feature Engineering**: Validate indicator calculations
- **Model Components**: Test LSTM layers and fusion strategies
- **Utilities**: Test configuration and logging

### Integration Tests
- **End-to-End Pipeline**: Data loading → features → training → prediction
- **Backtesting**: Strategy execution with realistic conditions
- **Live Trading**: Paper trading execution and risk management

### Performance Tests
- **Training Speed**: Model training time on different hardware
- **Memory Usage**: Memory consumption during processing
- **Scalability**: Performance with larger datasets

## 8. Deployment Guidelines

### Google Colab
1. Upload project files to Colab environment
2. Install dependencies: `!pip install -r requirements.txt`
3. Run notebooks in sequence for complete workflow
4. Use GPU runtime for faster training

### Linux Server
1. **System Setup**: Ubuntu/Debian with Python 3.10+
2. **Dependencies**: Install via `pip install -r requirements.txt`
3. **Configuration**: Set environment variables for API keys
4. **Service Setup**: Use systemd for 24/7 operation
5. **Monitoring**: Configure logging and alerting

### Docker Deployment
```dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "main.py", "live", "run", "--mode", "paper"]
```

## 9. Risk Management

### Position Sizing
- **Fixed Percentage**: Fixed % of portfolio per trade
- **Kelly Criterion**: Optimal position sizing based on win rate
- **Volatility Adjusted**: Position size based on market volatility

### Circuit Breakers
- **Drawdown Limits**: Stop trading if portfolio drops too much
- **Daily Loss Limits**: Stop trading if daily loss exceeds threshold
- **Position Limits**: Maximum number of concurrent positions
- **Time-based**: Stop trading during high-risk periods

### Monitoring
- **Performance Tracking**: Real-time P&L and risk metrics
- **Alert System**: Notifications for risk events
- **Health Checks**: System status and connectivity monitoring

## 10. Performance Optimization

### Training Optimization
- **GPU Acceleration**: Use CUDA for faster training
- **Batch Processing**: Optimize batch sizes for hardware
- **Mixed Precision**: Use float16 for memory efficiency
- **Model Parallelism**: Distribute model across multiple GPUs

### Inference Optimization
- **Model Quantization**: Reduce model size for faster inference
- **Batch Inference**: Process multiple predictions together
- **Caching**: Cache frequent calculations and features

### Data Optimization
- **Efficient Storage**: Use Parquet for compressed data storage
- **Lazy Loading**: Load data only when needed
- **Memory Mapping**: Use memory-mapped files for large datasets

## 11. Troubleshooting

### Common Issues
- **Import Errors**: Check Python path and dependencies
- **Configuration Errors**: Validate YAML syntax and required fields
- **Data Issues**: Check data format and missing values
- **Model Errors**: Verify tensor shapes and data types

### Debugging Tools
- **Logging**: Comprehensive logging at all levels
- **Profiling**: Performance profiling for bottlenecks
- **Visualization**: Plot data and model outputs for inspection
- **Interactive Debugging**: Use IPython for step-through debugging

## 12. Future Enhancements

### Model Improvements
- **Transformer Architectures**: Add attention-based models
- **Ensemble Methods**: Combine multiple model predictions
- **Reinforcement Learning**: Train using RL for better adaptation

### Feature Enhancements
- **Alternative Data**: News, sentiment, macroeconomic indicators
- **Advanced Indicators**: More sophisticated technical analysis
- **Regime Detection**: Machine learning-based market regime classification

### System Improvements
- **Multi-Asset Trading**: Handle multiple asset classes
- **Portfolio Optimization**: Modern portfolio theory integration
- **Risk Parity**: Equal risk contribution across positions

## 13. Contributing Guidelines

### Code Contributions
1. Follow existing code style and conventions
2. Add tests for new functionality
3. Update documentation for API changes
4. Use meaningful commit messages

### Issue Management
- Use clear, descriptive issue titles
- Provide detailed descriptions and reproduction steps
- Label issues appropriately (bug, enhancement, documentation)
- Reference related issues and PRs

### Review Process
- All PRs require at least one review
- Address review comments promptly
- Ensure CI checks pass before merge
- Update changelog for significant changes

## 14. Maintenance

### Regular Tasks
- **Model Retraining**: Update models with new data
- **Performance Review**: Analyze live vs backtested performance
- **Risk Review**: Adjust risk parameters based on performance
- **Dependency Updates**: Keep packages current and secure

### Monitoring
- **System Health**: Monitor server and application health
- **Performance Metrics**: Track key performance indicators
- **Error Rates**: Monitor and address error patterns
- **Resource Usage**: Track CPU, memory, and disk usage

## 15. Support and Resources

### Documentation
- **README.md**: Project overview and setup instructions
- **API Documentation**: Generated from docstrings
- **Colab Notebooks**: Interactive tutorials and examples
- **Configuration Guide**: Detailed parameter explanations

### Community
- **GitHub Issues**: Bug reports and feature requests
- **Discussions**: General questions and discussions
- **Wiki**: Additional documentation and guides

### Development Tools
- **IDE Support**: Cursor, VS Code with Python extensions
- **Linting**: Pylint, flake8 for code quality
- **Formatting**: Black for consistent code formatting
- **Testing**: Pytest for unit and integration tests

---

This guide ensures consistent development practices and high-quality code across all contributors and automated agents working on the LSTM trading bot project.