Try It Yourself
Enter values below, then hit Predict to see what the model says.
Raw Dataset (original data)
| school | sex | age | address | famsize | Pstatus | Medu | Fedu | Mjob | Fjob | reason | guardian | traveltime | studytime | failures | schoolsup | famsup | paid | activities | nursery | higher | internet | romantic | famrel | freetime | goout | Dalc | Walc | health | absences | G1 | G2 | G3 |
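|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|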
Cleaned Dataset (the version we feed to the model)
| G1 | failures | Medu | G3 |
|---|---|---|---|
The Code (how we built this model)
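import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Load the raw data and keep the three predictors plus the target (G3)
df = pd.read_csv('student_scores_original.csv')
df = df[['G1', 'failures', 'Medu', 'G3']]
# Drop rows with missing values
df = df.dropna()
# One-hot encode any categorical columns (has no effect if all four columns are numeric)
df = pd.get_dummies(df, drop_first=True)

# Separate the features from the target
X = df.drop('G3', axis=1)
y = df['G3']
# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale each feature to [0, 1]; fit on the training set only to avoid leakage
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Fit the linear model and report R² on the held-out test set
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(r2_score(y_test, y_pred))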
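Because the model is fit on min-max-scaled features, its coefficients are expressed per unit of the scaled feature, not per unit of the original column. One way to read them in original units is to undo the scaling algebraically. The sketch below is a minimal illustration in plain Python; the helper name `unscale_linear_model` and the example numbers are hypothetical and not part of the original code.

```python
def unscale_linear_model(coefs, intercept, data_min, data_max):
    """Convert coefficients learned on min-max-scaled features to raw units.

    Min-max scaling maps x to (x - min) / (max - min), so
    y = intercept + sum(c * (x - min) / (max - min))
    rearranges to a raw-unit slope of c / (max - min) for each feature,
    with the leftover constant terms folded into the intercept.
    """
    raw_coefs = []
    raw_intercept = intercept
    for c, lo, hi in zip(coefs, data_min, data_max):
        span = hi - lo
        if span == 0:
            raise ValueError("feature is constant; its raw slope is undefined")
        raw_coefs.append(c / span)
        raw_intercept -= c * lo / span
    return raw_coefs, raw_intercept


# Hypothetical numbers, purely for illustration (not taken from the dataset):
# one feature observed between 2 and 12, scaled coefficient 5, intercept 1.
raw_coefs, raw_intercept = unscale_linear_model([5.0], 1.0, [2.0], [12.0])
print(raw_coefs, raw_intercept)
```

With the fitted objects from the code above, the inputs would be `model.coef_`, `model.intercept_`, `scaler.data_min_`, and `scaler.data_max_`.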
Under the Hood (the equation the model learned)
Score₂ = 0.15 + 0.95 × Score₁ − 0.25 × Failures + 0.08 × Mom's Education
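Score 1 is by far the biggest driver, at nearly 1:1: each extra point on Score 1 raises the prediction by about 0.95 points. Each past failure costs about a quarter of a point, and each additional level of mom's education adds a small boost of about 0.08 points.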
Try the Equation Yourself
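The equation can also be evaluated directly in code. This is a minimal sketch using the rounded coefficients from the equation under "Under the Hood"; the function name `predict_score` and the example student are made up for illustration.

```python
def predict_score(score_1, failures, mom_education):
    """Evaluate the rounded equation from 'Under the Hood'."""
    return 0.15 + 0.95 * score_1 - 0.25 * failures + 0.08 * mom_education


# Hypothetical student: Score 1 of 10, no past failures, mom's education level 2.
print(f"{predict_score(10, 0, 2):.2f}")  # prints 9.81
```

Changing `failures` from 0 to 1 for the same student lowers the prediction by 0.25, matching the "quarter of a point" reading of the equation.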