Solving Logic with Simplicity: A Scalar-Magnitude Neural Network That Cracks XOR
In this personal experiment, I explored a novel neural network design built on the idea of using a scalar, magnitude-based activation, effectively replacing conventional nonlinear functions like ReLU or sigmoid with a simple identity function (y = x), where x is computed as a vector magnitude.
The input projection is computed with this formula:
x = || Σᵢ aᵢ · i · eᵢ || = sqrt(Σᵢ (aᵢ · i)²)
Where:
- aᵢ are learned scalar weights per feature
- i is the positional index of each input
- eᵢ is the corresponding basis vector
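Below is a minimal PyTorch sketch of one way this projection could be implemented. Two details are my assumptions rather than something the post specifies: the aᵢ coefficients are produced by a learned linear map (with bias) over the inputs, and positional indices start at 1 so the first component is not multiplied by zero. The small epsilon inside the square root is added only for numerical stability.

```python
import torch
import torch.nn as nn

class ScalarMagnitude(nn.Module):
    """Sketch of the position-weighted magnitude projection.

    Computes x = ||sum_i a_i * i * e_i|| = sqrt(sum_i (a_i * i)^2),
    where the a_i come from a learned linear map of the inputs
    (an assumption; the post only calls them learned scalars).
    """

    def __init__(self, in_features: int, n_components: int):
        super().__init__()
        self.proj = nn.Linear(in_features, n_components)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.proj(x)  # a_i: one learned coefficient per component
        # Positional indices i = 1..n (starting at 1 is an assumption).
        idx = torch.arange(1, a.shape[-1] + 1, dtype=a.dtype, device=a.device)
        # Identity activation afterwards: the magnitude itself is the output.
        return torch.sqrt(((a * idx) ** 2).sum(dim=-1, keepdim=True) + 1e-12)
```

One way to read the result: although the post-projection activation is the identity, the magnitude (a square root of a sum of squares) is itself a non-linear function of the inputs, which plausibly supplies the expressiveness needed for XOR.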
Despite its simplicity, this architecture not only solves the linearly separable OR gate, but also successfully learns the non-linearly separable XOR gate, a common benchmark for evaluating the expressiveness of neural networks.
A hypothesis still to be confirmed by future experiments: scaling each z-scored feature by sqrt(1/n) should be effective for normalizing the sum of squared values and keeping feature contributions balanced in this method.
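A quick sanity check of this hypothesis: if each z-scored feature zᵢ has unit variance, then E[Σ zᵢ²] = n, so scaling every feature by sqrt(1/n) keeps the expected squared magnitude near 1 regardless of feature count. A small NumPy demonstration (feature count and sample count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8                                    # arbitrary feature count
z = rng.standard_normal((100_000, n))    # stand-in for z-scored features

# Unscaled: the expected sum of squares grows linearly with n.
print((z ** 2).sum(axis=1).mean())                     # ~ 8.0
# Scaled by sqrt(1/n): the expectation stays near 1 for any n.
print(((z * np.sqrt(1 / n)) ** 2).sum(axis=1).mean())  # ~ 1.0
```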
🔍 Why This Matters
This concept challenges the assumption that non-linear activation functions are always necessary for learning non-linear decision boundaries. By projecting weighted input features into a scalar magnitude using positional indexing, and then passing that magnitude through a simple identity activation (y = x), the network surprisingly learns both linear (OR) and non-linear (XOR) tasks.
While demonstrated here on logic gates, the simplicity of the architecture opens up a deeper question:
Could this scalar magnitude mechanism generalize to deeper models or real-world tasks?
If future experiments prove the approach effective on more complex datasets or architectures (e.g., image classifiers, sequence models), it may offer a lightweight, interpretable alternative to traditional activation-heavy networks, especially in constrained environments or theoretical investigations.
One-liner summary:
The model z-scores the inputs, applies a scalar magnitude projection over position-weighted inputs, uses an identity activation (y = x), and trains via gradient descent.
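To make the one-liner concrete, here is a hypothetical end-to-end sketch of that pipeline on XOR. The loss (MSE), optimizer (Adam), learning rate, epoch count, and the linear readout after the magnitude are all assumptions on my part; the actual notebook may differ, and results vary with the random seed.

```python
import torch
import torch.nn as nn

# XOR truth table.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# Step 1: z-score the inputs per feature.
Xz = (X - X.mean(0)) / (X.std(0) + 1e-8)

class MagnitudeNet(nn.Module):
    # Scalar-magnitude projection followed by a linear readout;
    # no non-linear activation function is applied anywhere (y = x).
    def __init__(self, in_features: int, n_components: int = 2):
        super().__init__()
        self.proj = nn.Linear(in_features, n_components)
        self.readout = nn.Linear(1, 1)

    def forward(self, x):
        a = self.proj(x)
        idx = torch.arange(1, a.shape[-1] + 1, dtype=a.dtype)
        m = torch.sqrt(((a * idx) ** 2).sum(-1, keepdim=True) + 1e-12)
        return self.readout(m)

torch.manual_seed(0)  # results are seed-dependent
model = MagnitudeNet(2)
opt = torch.optim.Adam(model.parameters(), lr=0.05)

# Step 2: gradient-based training (optimizer, lr, epochs are assumptions).
for _ in range(1000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(Xz), y)
    loss.backward()
    opt.step()

print((model(Xz) > 0.5).int().flatten().tolist())  # hoped-for output: [0, 1, 1, 0]
```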
📊 Statistics Summary
The following statistics were gathered from training two models on the classic XOR gate problem using PyTorch, running on Google Colab:
Model 1: Conventional Neural Network (ReLU)
- Task: XOR
- Accuracy: 0.75
- Final Loss: 0.3614
- Training Time: ≈0.1992 seconds
- Estimated FLOPs: ≈36

Model 2: Custom Scalar-Magnitude Neural Network
- Task: XOR
- Accuracy: 1.00
- Final Loss: 0.0000
- Training Time: ≈0.2446 seconds
- Estimated FLOPs: ≈48
This post is part of a broader investigation into simplified activations in neural networks — more results to follow.
© 2025 Paul KP Fung. You may copy and share for non-commercial purposes with proper attribution. Commercial use requires explicit permission.
📁 Download & Run
👉 Open the Colab Notebook
Includes code, toggles for OR/XOR, metrics tracking, and training loop.
📜 License
This work is released under the Creative Commons Attribution–NonCommercial 4.0 International (CC BY-NC 4.0) license. https://creativecommons.org/licenses/by-nc/4.0/legalcode
- You may share or adapt this work for non-commercial use.
- You must give proper credit:
  - Author: Paul KP Fung
  - Co-created with: ChatGPT by OpenAI
  - Libraries: PyTorch, NumPy
- Commercial use requires explicit permission.
📧 For collaboration, licensing, or commercial use inquiries, please contact: machinesmartsor@gmail.com
Generated in part with the help of AI. All experiments were conducted on Google Colab using free-tier or standard GPUs. Performance numbers are approximate.
⚠️ Disclaimer
This post presents a personal, exploratory experiment conducted using publicly available tools and platforms (e.g., PyTorch, Google Colab). The results, interpretations, and conclusions are based on small-scale testing and have not been peer-reviewed or formally validated. Performance numbers are approximate and may vary under different settings.
Readers are encouraged to interpret the content with discretion. Use of any part of this work for research, experimentation, or development is at the reader’s own risk. The content is intended for educational and exploratory purposes only.