Your brain has ~86 billion neurons. Each receives signals from others, weighs their importance, sums them, and fires if the total is strong enough. An artificial neuron is a simplified numerical model of this process. It's the basic building block that, when stacked in layers, can recognise faces, translate languages, and play chess at superhuman level.
The computation inside a neuron
Step 1 — Weighted sum:
z = w₁·x₁ + w₂·x₂ + w₃·x₃ + b
= Σ(wᵢ·xᵢ) + b
Step 2 — Activation:
output = σ(z)
where:
x₁, x₂, x₃ = input values
w₁, w₂, w₃ = learned weights (importance of each input)
b = bias (shifts the threshold up or down)
σ = activation function (ReLU, sigmoid, tanh...)
z = pre-activation (raw weighted sum)
Example (spam detection, single neuron):
x₁ = 'free' word count = 3
x₂ = 'click here' count = 1
x₃ = sender known = 0
w₁=0.8, w₂=0.6, w₃=−0.9, b=−0.5
z = 0.8×3 + 0.6×1 + (−0.9)×0 + (−0.5) = 2.5
output = sigmoid(2.5) = 0.92 → likely spam
The weighted sum z is called the pre-activation or logit. σ(z) is the neuron's output.
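The spam example above can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the function names `sigmoid` and `neuron` are illustrative choices, not part of any particular framework.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Return (z, output) for one neuron: weighted sum, then activation."""
    # Step 1: weighted sum of the inputs plus the bias
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Step 2: pass the pre-activation through the activation function
    return z, activation(z)


# Values taken from the spam-detection example
x = [3, 1, 0]          # 'free' count, 'click here' count, sender known
w = [0.8, 0.6, -0.9]   # weights
b = -0.5               # bias

z, output = neuron(x, w, b)
print(f"z = {z:.2f}, output = {output:.2f}")  # z = 2.50, output = 0.92
```

Swapping `activation` for another function (for example a ReLU) changes only Step 2; the weighted sum is computed the same way.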
The role of bias
The bias b is a constant added to the weighted sum. Without bias, the neuron's decision boundary (the set of inputs where z = 0) always passes through the origin, so the neuron cannot separate classes whose dividing line lies away from it. The weights set the orientation of the boundary; the bias shifts it toward or away from the origin. Think of it as a learnable threshold: a large positive bias makes the neuron activate more easily; a negative bias makes it harder to activate.
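To see the threshold effect in numbers, the sketch below feeds the same input to a sigmoid neuron while changing only the bias. The weights, input, and bias values are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, b):
    """Output of a single sigmoid neuron."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


w = [1.0, 1.0]   # hypothetical weights
x = [0.5, 0.5]   # hypothetical input; its weighted sum is 1.0

# Same input and weights, three different biases
outputs = {b: neuron(x, w, b) for b in (-3.0, 0.0, 3.0)}
for b, out in outputs.items():
    print(f"b = {b:+.1f} -> output = {out:.3f}")

# With zero bias, an all-zero input always gives z = 0 and output 0.5,
# whatever the weights are: the boundary is pinned to the origin.
at_origin = neuron([0.0, 0.0], w, 0.0)
```

The output rises as the bias increases: a negative bias pushes the same input below 0.5, a positive bias pushes it well above.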
Why weights and bias are both needed
From one neuron to a network
- A single neuron = a linear model (like logistic regression) — limited to linear decisions
- Stack neurons in a layer → each learns a different linear combination of inputs
- Add an activation function (non-linear) between layers → now the network can learn curves, spirals, and other non-linear boundaries; without it, stacked layers collapse into a single linear model
- Stack multiple layers → deeper features: edges → shapes → objects → concepts
- Millions of neurons with millions of learned weights = the capacity to approximate very complex functions
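The steps in this list can be illustrated with XOR, a pattern no single neuron can separate. The sketch below uses a tiny two-layer network whose weights are hand-picked for illustration (not learned), and runs it once with a ReLU between the layers and once without any non-linearity. All names and values are assumptions made for this example.

```python
def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def layer(inputs, weights, biases, activation):
    """One layer: every neuron takes a weighted sum of the inputs, then activates."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked weights (illustrative only): 2 hidden neurons, 1 output neuron
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUT_W = [[1.0, -2.0]]
OUT_B = [0.0]


def network(x, hidden_activation):
    h = layer(x, HIDDEN_W, HIDDEN_B, hidden_activation)
    return layer(h, OUT_W, OUT_B, identity)[0]


points = [(0, 0), (0, 1), (1, 0), (1, 1)]
with_relu = [network(p, relu) for p in points]
without_activation = [network(p, identity) for p in points]

print(with_relu)           # [0.0, 1.0, 1.0, 0.0] -> XOR
print(without_activation)  # [2.0, 1.0, 1.0, 0.0] -> still linear: 2 - (x1 + x2)
```

With the ReLU in place the network reproduces XOR; with it removed, the same weights reduce to one linear function of the inputs, which is why the non-linearity between layers matters.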