What is Multivariate Regression? #
Multivariate Regression is a statistical and machine learning method used when you want to predict more than one output variable simultaneously. Instead of creating separate models for each target, this approach builds a single unified model that learns the relationships between input features and multiple outputs at the same time.
This is particularly powerful when the output variables are interrelated, because the model can capture and use those relationships to improve prediction accuracy.

Key Characteristics #
- Produces two or more outputs in one prediction
- Uses a single model to handle multiple target variables
- Relies on multiple input features for prediction
- Performs better when outputs are correlated or dependent
- Reduces redundancy compared to building separate models
Why Use Multivariate Regression? #
In many real-world problems, outputs are not independent. Modeling them together allows the algorithm to:
- Capture shared patterns between targets
- Improve overall prediction performance
- Reduce training time and complexity
- Provide more realistic and consistent predictions
Example #
Imagine you want to predict a student’s performance. Instead of predicting each subject separately, you can build one model that predicts both:
- Math marks
- Science marks
using inputs such as:
- Study hours
- Class attendance
Since performance in different subjects is often related, the model can learn these connections and make better joint predictions.
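This joint setup can be tried directly in scikit-learn, whose `LinearRegression` accepts a two-dimensional target matrix and fits all outputs in one model. The numbers below are made up purely for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: each row is one student.
# Inputs: [study hours, class attendance %]
X = np.array([
    [2, 60],
    [4, 70],
    [6, 80],
    [8, 90],
    [10, 95],
])

# Outputs: [math marks, science marks] -- two targets per student
Y = np.array([
    [50, 55],
    [60, 62],
    [70, 71],
    [80, 78],
    [88, 86],
])

# A single model learns both targets at once:
# scikit-learn's LinearRegression supports a 2-D target matrix natively.
model = LinearRegression()
model.fit(X, Y)

# One call predicts both subjects jointly.
pred = model.predict([[5, 75]])
print(pred.shape)  # one row, two predicted marks
```

Because both columns of `Y` are fit against the same design matrix, the prediction for a new student comes back as a single row containing both marks.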
Architecture of Multivariate Regression #
To understand multivariate regression properly, it helps to start with a quick recap of simple linear regression. In simple linear regression, we predict a single output using one input feature. As we move forward, multiple linear regression allows several inputs but still predicts only one output.
Multivariate regression goes a step further — it enables us to predict multiple outputs at the same time using one unified model. Instead of writing separate equations for each target variable, we represent everything in a matrix form, which makes computations efficient and scalable.
Mathematical Representation #
The general form of multivariate regression is:

Y = XB + ε

Although this equation looks simple, it represents an entire system in which multiple outputs are predicted simultaneously in a structured way.
Components of the Equation #
Y (Output Matrix) #
- Contains all the target variables
- Shape: (n × p) → where n = number of observations, p = number of outputs
- Example: Math and Science scores
X (Input Matrix) #
- Includes all input features
- Shape: (n × (k+1)) → includes intercept term
- Example: Study hours, attendance
B (Coefficient Matrix) #
- Stores weights (coefficients) for each feature-output pair
- Shape: ((k+1) × p)
- Each column corresponds to one output variable
ε (Error Matrix) #
- Represents the difference between actual and predicted values
- Captures noise or unexplained variation in the data
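Putting the shapes above together, the matrix equation can be written out as:

```latex
\underbrace{Y}_{n \times p}
\;=\;
\underbrace{X}_{n \times (k+1)}\,
\underbrace{B}_{(k+1) \times p}
\;+\;
\underbrace{\varepsilon}_{n \times p}
```

Each of the p columns of B holds the coefficients for one output, so multiplying X by B produces all p predictions for every observation in a single matrix product.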
How It Works #
Instead of solving separate equations like:
- Math = f(inputs)
- Science = f(inputs)
Multivariate regression solves them together in one system. This allows the model to:
- Learn shared patterns between outputs
- Capture relationships among target variables
- Improve prediction accuracy when outputs are correlated
Working of Multivariate Regression #
Multivariate regression follows a systematic process to learn from data and generate predictions for multiple outputs simultaneously. Instead of handling each target separately, the model works in a step-by-step pipeline, using matrix operations to make the process efficient and scalable.
Step 1: Prepare Input and Output Matrices #
The first step is to organize the dataset into matrix form:
- X (Input Matrix)
  Contains all the independent variables (features)
  Example: Area, number of rooms
- Y (Output Matrix)
  Contains multiple dependent variables (targets)
  Example: House price and rental value
Structuring data this way allows the model to process multiple outputs in a unified manner.
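As a minimal sketch of this step, the house example can be laid out with NumPy; the figures are invented for illustration, and a column of ones is prepended to X so the intercept is learned as part of B:

```python
import numpy as np

# Hypothetical raw features: [area (sq ft), number of rooms]
area_rooms = np.array([
    [1000.0, 2],
    [1500.0, 3],
    [2000.0, 3],
    [2500.0, 4],
])

# Prepend a column of ones so the intercept becomes the first row of B.
n = area_rooms.shape[0]
X = np.hstack([np.ones((n, 1)), area_rooms])  # shape (n, k+1) = (4, 3)

# Two targets per house: [sale price, monthly rental value]
Y = np.array([
    [200000.0, 1200.0],
    [280000.0, 1600.0],
    [350000.0, 1900.0],
    [430000.0, 2300.0],
])  # shape (n, p) = (4, 2)

print(X.shape, Y.shape)
```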
Step 2: Estimate the Coefficient Matrix #
To determine the optimal weights, we compute the coefficient matrix B using the normal equation adapted for multiple outputs:
B = (XᵀX)⁻¹XᵀY
This formula finds the values of B that minimize the overall prediction error across all outputs.
Understanding the Components #
- Xᵀ (Transpose of X)
  Converts rows into columns and vice versa
- (XᵀX)⁻¹ (Matrix Inverse)
  Helps in solving the system of equations uniquely
- XᵀY
  Captures the relationship between inputs and outputs
By combining these operations, we obtain the best-fit coefficient matrix.
Step 3: Generate Predictions #
Once the coefficient matrix is computed, predictions for all outputs are made simultaneously using:
Ŷ = XB
Here, Ŷ (Y hat) represents the predicted values for all target variables.
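Steps 2 and 3 can be sketched with NumPy on the same made-up house data from Step 1. One assumption worth flagging: `np.linalg.pinv` is used in place of a plain matrix inverse so the sketch still works when XᵀX is singular or ill-conditioned:

```python
import numpy as np

# Same toy setup as Step 1: intercept column plus two features.
X = np.array([
    [1.0, 1000.0, 2],
    [1.0, 1500.0, 3],
    [1.0, 2000.0, 3],
    [1.0, 2500.0, 4],
])
Y = np.array([
    [200000.0, 1200.0],
    [280000.0, 1600.0],
    [350000.0, 1900.0],
    [430000.0, 2300.0],
])

# Step 2: normal equation B = (X^T X)^{-1} X^T Y,
# with a pseudo-inverse for numerical safety.
B = np.linalg.pinv(X.T @ X) @ X.T @ Y  # shape (k+1, p) = (3, 2)

# Step 3: all outputs predicted in one matrix product.
Y_hat = X @ B

print(B.shape)                  # (3, 2): one coefficient column per target
print(np.round(Y_hat - Y, 2))  # residuals across both targets
```

Each column of `B` holds the intercept and feature weights for one target, so the single product `X @ B` yields price and rental predictions for every house at once.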
How the Process Works Together #
- Data is structured into matrices
- Coefficients are calculated using linear algebra
- Predictions are generated in one operation
This approach ensures that all outputs are learned jointly, allowing the model to capture dependencies between them.
Implementing Linear Regression in Python #
Before we code, let’s look at the logical flow of our system:
Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error
Load Dataset
# Load the dataset
df = pd.read_csv('/kaggle/input/datasets/organizations/ailearner-researchlab/iot-intrusion-detection-hybrid-ml-dl-dataset/final_dataset.csv')
# For demonstration, we'll assume 'df' is already loaded
df.head(10)
Distribution of Flow Duration by Traffic Type (0=Normal, 1=Attack)
# 1. Setting the visual style
sns.set(style="whitegrid")
# 2. Select key features
features_to_plot = ['Flow Duration', 'Total Fwd Packets', 'Protocol']
# 3. Create the figure
plt.figure(figsize=(12, 6))
# FIXED: Added hue='Label' and legend=False to resolve the FutureWarning
sns.violinplot(
x='Label',
y='Flow Duration',
hue='Label',
data=df,
palette='muted',
split=True,
legend=False
)
# 4. Adding titles and labels
plt.title('Distribution of Flow Duration by Traffic Type (0=Normal, 1=Attack)')
plt.xlabel('Traffic Label')
plt.ylabel('Flow Duration (Microseconds)')
# Applying log scale
plt.yscale('log')
plt.show()
Distribution of Normal vs. Attack Traffic
# Setting the style
sns.set_theme(style="whitegrid")
# Create the Bar Plot for Class Counts
plt.figure(figsize=(8, 6))
# FIXED: Added hue='Label' and legend=False
ax = sns.countplot(x='Label', data=df, hue='Label', palette='viridis', legend=False)
# Adding labels for clarity
plt.title('Distribution of Normal vs. Attack Traffic')
plt.xlabel('Traffic Label (0: Normal, 1: Attack)')
plt.ylabel('Number of Samples')
# Adding count annotations on top of bars
for p in ax.patches:
ax.annotate(f'{int(p.get_height())}', (p.get_x() + 0.3, p.get_height() + 5000))
plt.show()
Strip Plot: Flow Duration across Traffic Types
# 1. Set visual style
sns.set_theme(style="ticks")
# 2. Select a feature to visualize
# Sampling for performance
df_sample = df.sample(n=10000, random_state=42)
# 3. Create the Strip Plot
plt.figure(figsize=(10, 6))
sns.stripplot(
data=df_sample,
x='Label',
y='Flow Duration',
hue='Label',
jitter=True,
alpha=0.5,
palette='viridis',
dodge=True,
legend=False # Keeps the plot clean as 'Label' is on the x-axis
)
# 4. Refine the plot
plt.title('Strip Plot: Flow Duration across Traffic Types')
plt.xlabel('Traffic Label (0: Normal, 1: Attack)')
# FIXED: Added 'r' before the string to handle the LaTeX backslash correctly
plt.ylabel(r'Flow Duration ($\mu s$)')
# Using log scale
plt.yscale('log')
plt.show()
Correlation Heatmap: IoT-IDS Feature Relationships
# 1. Calculate the Correlation Matrix
# We calculate how every feature relates to every other feature (-1 to 1)
corr_matrix = df.corr(numeric_only=True)
# 2. Set up the figure
plt.figure(figsize=(16, 10))
# 3. Create the Heatmap
# cmap='coolwarm': Blue for negative correlation, Red for positive
# annot=True: Shows the numerical value in each cell
sns.heatmap(
corr_matrix,
annot=True,
fmt=".2f",
cmap='coolwarm',
linewidths=0.5,
cbar_kws={"shrink": .8}
)
# 4. Add title and adjust layout
plt.title('Correlation Heatmap: IoT-IDS Feature Relationships', fontsize=20)
plt.xticks(rotation=45, ha='right')
plt.yticks(rotation=0)
plt.tight_layout()
plt.show()
Training Model
# Dropping rows with missing values
df.dropna(inplace=True)
# Defining Features (X) and Target (y)
# Using the engineered flow features described in the dataset metadata
features = ['Src Port', 'Dst Port', 'Protocol', 'Flow Duration',
'Total Fwd Packet', 'Total Bwd packets', 'Flow IAT Min']
X = df[features]
y = df['Label']
# Split the data (80% Training, 20% Testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Scaling features for better convergence
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
Train the Model
# Initialize the model
model = LinearRegression()
# Fit the model
model.fit(X_train_scaled, y_train)
# Make predictions
y_pred = model.predict(X_test_scaled)
coefficients = pd.DataFrame({'Feature': features, 'Coefficient': model.coef_})
print(coefficients.sort_values(by='Coefficient', ascending=False))

Output:
             Feature  Coefficient
6       Flow IAT Min     0.122328
0           Src Port     0.103173
1           Dst Port     0.041321
4   Total Fwd Packet     0.002735
5  Total Bwd packets    -0.008765
3      Flow Duration    -0.148633
2           Protocol    -0.199873
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Squared Error: {mse:.4f}")
print(f"R-squared Score: {r2:.4f}")

Output:
Mean Squared Error: 0.1237
R-squared Score: 0.3600
Actual vs Predicted Labels (Regression)
plt.figure(figsize=(10, 6))
sns.scatterplot(x=y_test, y=y_pred, alpha=0.1)
plt.plot([0, 1], [0, 1], color='red', linestyle='--') # Perfect prediction line
plt.title('Actual vs Predicted Labels (Regression)')
plt.xlabel('Actual Label (0 or 1)')
plt.ylabel('Predicted Value')
plt.show()
