Solving Logic with Simplicity: A Scalar-Magnitude Neural Network That Cracks XOR
In this personal experiment, I explored a novel neural network design built on the idea of using a scalar, magnitude-based activation, effectively replacing conventional nonlinear functions like ReLU or sigmoid with a simple identity function (y = x), where x is computed as a vector magnitude.
The input projection is computed with this formula:
x = || Σᵢ aᵢ · i · eᵢ || = sqrt(Σᵢ (aᵢ · i)²)
Where:
- aᵢ are learned scalar weights per feature
- i is the positional index of each input
- eᵢ is the corresponding basis vector
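Below is a minimal PyTorch sketch of one way this projection could be implemented. Two details are my assumptions rather than something the post specifies: the aᵢ coefficients are produced by a learned linear map (with bias) over the inputs, and positional indices start at 1 so the first component is not multiplied by zero. The small epsilon inside the square root is added only for numerical stability.

```python
import torch
import torch.nn as nn

class ScalarMagnitude(nn.Module):
    """Sketch of the position-weighted magnitude projection.

    Computes x = ||sum_i a_i * i * e_i|| = sqrt(sum_i (a_i * i)^2),
    where the a_i come from a learned linear map of the inputs
    (an assumption; the post only calls them learned scalars).
    """

    def __init__(self, in_features: int, n_components: int):
        super().__init__()
        self.proj = nn.Linear(in_features, n_components)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.proj(x)  # a_i: one learned coefficient per component
        # Positional indices i = 1..n (starting at 1 is an assumption).
        idx = torch.arange(1, a.shape[-1] + 1, dtype=a.dtype, device=a.device)
        # Identity activation afterwards: the magnitude itself is the output.
        return torch.sqrt(((a * idx) ** 2).sum(dim=-1, keepdim=True) + 1e-12)
```

One way to read the result: although the post-projection activation is the identity, the magnitude (a square root of a sum of squares) is itself a non-linear function of the inputs, which plausibly supplies the expressiveness needed for XOR.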
Despite its simplicity, this architecture not only solves the linearly separable OR gate, but also successfully learns the non-linearly separable XOR gate, a common benchmark for evaluating the expressiveness of neural networks.
A hypothesis still to be confirmed by future experiments: scaling each z-scored feature by sqrt(1/n) should be effective for normalizing the sum of squared values and keeping feature contributions balanced in this method.
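A quick sanity check of this hypothesis: if each z-scored feature zᵢ has unit variance, then E[Σ zᵢ²] = n, so scaling every feature by sqrt(1/n) keeps the expected squared magnitude near 1 regardless of feature count. A small NumPy demonstration (feature count and sample count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8                                    # arbitrary feature count
z = rng.standard_normal((100_000, n))    # stand-in for z-scored features

# Unscaled: the expected sum of squares grows linearly with n.
print((z ** 2).sum(axis=1).mean())                     # ~ 8.0
# Scaled by sqrt(1/n): the expectation stays near 1 for any n.
print(((z * np.sqrt(1 / n)) ** 2).sum(axis=1).mean())  # ~ 1.0
```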
🔍 Why This Matters
This concept challenges the assumption that non-linear activation functions are always necessary for learning non-linear decision boundaries. By projecting weighted input features into a scalar magnitude using positional indexing, and then passing that magnitude through a simple identity activation (y = x), the network surprisingly learns both linear (OR) and non-linear (XOR) tasks.
While demonstrated here on logic gates, the simplicity of the architecture opens up a deeper question:
Could this scalar magnitude mechanism generalize to deeper models or real-world tasks?
If future experiments prove the approach effective on more complex datasets or architectures (e.g., image classifiers, sequence models), it may offer a lightweight, interpretable alternative to traditional activation-heavy networks, especially in constrained environments or theoretical investigations.
One-liner summary:
The model z-scores the inputs, applies a scalar magnitude projection over position-weighted inputs, uses an identity activation (y = x), and trains via gradient descent.
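To make the one-liner concrete, here is a hypothetical end-to-end sketch of that pipeline on XOR. The loss (MSE), optimizer (Adam), learning rate, epoch count, and the linear readout after the magnitude are all assumptions on my part; the actual notebook may differ, and results vary with the random seed.

```python
import torch
import torch.nn as nn

# XOR truth table.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# Step 1: z-score the inputs per feature.
Xz = (X - X.mean(0)) / (X.std(0) + 1e-8)

class MagnitudeNet(nn.Module):
    # Scalar-magnitude projection followed by a linear readout;
    # no non-linear activation function is applied anywhere (y = x).
    def __init__(self, in_features: int, n_components: int = 2):
        super().__init__()
        self.proj = nn.Linear(in_features, n_components)
        self.readout = nn.Linear(1, 1)

    def forward(self, x):
        a = self.proj(x)
        idx = torch.arange(1, a.shape[-1] + 1, dtype=a.dtype)
        m = torch.sqrt(((a * idx) ** 2).sum(-1, keepdim=True) + 1e-12)
        return self.readout(m)

torch.manual_seed(0)  # results are seed-dependent
model = MagnitudeNet(2)
opt = torch.optim.Adam(model.parameters(), lr=0.05)

# Step 2: gradient-based training (optimizer, lr, epochs are assumptions).
for _ in range(1000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(Xz), y)
    loss.backward()
    opt.step()

print((model(Xz) > 0.5).int().flatten().tolist())  # hoped-for output: [0, 1, 1, 0]
```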
📊 Statistics Summary
The following statistics were gathered from training two models on the classic XOR gate problem using PyTorch, running on Google Colab:
Model 1: Conventional Neural Network (ReLU)
- Task: XOR
- Accuracy: 0.75
- Final Loss: 0.3614
- Training Time: ≈0.1992 seconds
- Estimated FLOPs: ≈36

Model 2: Custom Scalar-Magnitude Neural Network
- Task: XOR
- Accuracy: 1.00
- Final Loss: 0.0000
- Training Time: ≈0.2446 seconds
- Estimated FLOPs: ≈48
This post is part of a broader investigation into simplified activations in neural networks — more results to follow.
© 2025 Paul KP Fung. You may copy and share for non-commercial purposes with proper attribution. Commercial use requires explicit permission.
📁 Download & Run
👉 Open the Colab Notebook
Includes code, toggles for OR/XOR, metrics tracking, and training loop.
📜 License
This work is released under the Creative Commons Attribution–NonCommercial 4.0 International (CC BY-NC 4.0) license. https://creativecommons.org/licenses/by-nc/4.0/legalcode
- You may share or adapt this work for non-commercial use.
- You must give proper credit:
  - Author: Paul KP Fung
  - Co-created with: ChatGPT by OpenAI
  - Libraries: PyTorch, NumPy
- Commercial use requires explicit permission.
📧 For collaboration, licensing, or commercial use inquiries, please contact: machinesmartsor@gmail.com
Generated in part with the help of AI. All experiments were conducted on Google Colab using free-tier or standard GPUs. Performance numbers are approximate.
⚠️ Disclaimer
This post presents a personal, exploratory experiment conducted using publicly available tools and platforms (e.g., PyTorch, Google Colab). The results, interpretations, and conclusions are based on small-scale testing and have not been peer-reviewed or formally validated. Performance numbers are approximate and may vary under different settings.
Readers are encouraged to interpret the content with discretion. Use of any part of this work for research, experimentation, or development is at the reader’s own risk. The content is intended for educational and exploratory purposes only.