This proposal suggests a standard for commonly used mathematical notation for machine learning.

The field of machine learning has evolved rapidly in recent years, and communication between different researchers and research groups has become increasingly important. A key obstacle to communication is inconsistent notation across papers. This proposal suggests a standard for commonly used mathematical notation in machine learning. This first version covers only part of the notation in common use; more will be added. The proposal will be updated regularly as the field progresses.

You can adopt this notation by downloading the LaTeX macro package MLMath.sty, which is kept up to date with this proposal; see the GitHub repository for more information.
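As a sketch of how the macros might be used in a paper (assuming MLMath.sty defines the "simplified" commands such as \vx, \vy, \vtheta, \fX, \fY, \fZ, \fH, and \fD listed in the notation table):

```latex
% Minimal sketch, assuming MLMath.sty provides the simplified macros
% from the notation table (\vx, \vtheta, \fD, \fZ, \fX, \fY, \fH, ...).
\documentclass{article}
\usepackage{MLMath}
\begin{document}
Given a sample set $S=\{(\vx_i,\vy_i)\}_{i=1}^{n}$ drawn from a
distribution $\fD$ over $\fZ=\fX\times\fY$, the empirical risk of a
hypothesis $f_{\vtheta}\in\fH$ is
\[
  L_S(\vtheta)=\frac{1}{n}\sum_{i=1}^{n}\ell\bigl(f_{\vtheta},(\vx_i,\vy_i)\bigr).
\]
\end{document}
```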

We’ve heard things like

Chenglong Bao
Assistant Professor, Tsinghua University
The document provides comprehensive, clear mathematical notations that are commonly used in machine learning. These consistent definitions are easily accessible to beginners and facilitate communication among researchers from different backgrounds, which is important for the development of this interdisciplinary subject.
Tiejun Li
Professor, Peking University
The popularity of machine learning urgently calls for a unified set of mathematical notations, which would greatly reduce the trouble researchers face when reading papers that use different notation systems. The "General Notation for Machine Learning", organized by Zhiqin Xu, takes a wide range of notation conventions in the existing literature into consideration and is an important contribution in this direction.
Pingbing Ming
Professor, ICMSEC
Machine learning, a highly interdisciplinary and rapidly evolving discipline, has attracted numerous researchers from different areas such as mathematics, physics, chemistry, biology, statistics, engineering, and even the humanities and social sciences, and is rapidly having a huge impact on those disciplines. A problem researchers from different disciplines have to face is how to remove barriers to communication, and one of the main barriers is notation (symbols), which is, in effect, the language of a discipline. Across disciplines, we often find that two notations that appear very far apart express the same concept. A notation recognized as beautiful and simple in one discipline may be rediscovered and redefined in another, and the newly defined notation is likely to be far less convenient than the existing one. The General Notation for Machine Learning is a very timely initiative that helps unify the language of machine learning and greatly reduces communication barriers. Of course, this project will also require the active participation and contributions of researchers from different disciplines to finally achieve uniformity in core machine learning notation.
Yang Yuan
Assistant Professor, Tsinghua University
Writing a paper is like telling a story, and notation is the language the writer uses. When someone tells a story in a language you are not familiar with, the story, no matter how good, becomes obscure and may even require translation (replacing the symbols with ones you know). This is not uncommon: I have found that not only scholars in different fields, but even those in the same field in different research groups, show different preferences in their choice of symbol systems. This creates unnecessary complications for readers. I therefore think it makes sense to provide a common notation standard for the field of machine learning, and I hope it will help researchers and promote the development of the domain.
Haijun Yu
Associate Professor, ICMSEC
Machine learning has been around for more than half a century and is starting to play a vital role in a variety of industries in today's era of big data. However, the field draws on mathematics, probability and statistics, computer science theory, and many other disciplines, and the mathematical notations and expressions used vary from discipline to discipline and from school to school, causing potential trouble for newcomers to the field. Combining theoretical analysis with practical needs in applications, BAAI provides a self-consistent and concise mathematical notation system, the General Notation for Machine Learning. The system is expected to facilitate the reading of the literature and the writing of papers in machine learning, and to reduce the problems and misunderstandings caused by differing notation conventions.
Zhanxing Zhu
Assistant Professor, Peking University
Machine learning has developed as an interdisciplinary field and has significantly impacted many other domains. It has attracted researchers from statistics, applied mathematics, physics, computer science, electrical engineering, and beyond. This raises the need to communicate with each other smoothly, and in particular a consistent notation system is in demand. This proposal is an important first step towards that goal; thanks to the authors for their efforts! Indeed, such a notation system requires researchers from all related fields to contribute and provide suggestions.

Notation Table

See the full Guide for more

| symbol | meaning | LaTeX | simplified |
| --- | --- | --- | --- |
| x | input | \bm{x} | \vx |
| y | output, label | \bm{y} | \vy |
| d | input dimension | d | |
| d_o | output dimension | d_{\rm o} | d_{\rm o} |
| n | number of samples | n | |
| X | instances domain (a set) | \mathcal{X} | \fX |
| Y | labels domain (a set) | \mathcal{Y} | \fY |
| Z = X × Y | example domain | \mathcal{Z} | \fZ |
| H | hypothesis space (a set) | \mathcal{H} | \fH |
| θ | a set of parameters | \bm{\theta} | \vtheta |
| f_θ : X → Y | hypothesis function | f_{\bm{\theta}} | f_{\vtheta} |
| f or f* : X → Y | target function | f, f^* | |
| ℓ : H × Z → R_+ | loss function | \ell | |
| D | distribution of Z | \mathcal{D} | \fD |
| S = {z_i}_{i=1}^n = {(x_i, y_i)}_{i=1}^n | sample set | | |
| L_S(θ), L_n(θ), R_n(θ), R_S(θ) | empirical risk or training loss | | |
| L_D(θ), R_D(θ) | population risk or expected loss | | |
| σ : R → R_+ | activation function | \sigma | |
| w_j | input weight | \bm{w}_j | \vw_j |
| a_j | output weight | a_j | |
| b_j | bias term | b_j | |
| f_θ(x) or f(x; θ) | neural network | f_{\bm{\theta}} | f_{\vtheta} |
| ∑_{j=1}^m a_j σ(w_j · x + b_j) | two-layer neural network | | |
| VCdim(H) | VC-dimension of H | | |
| Rad(H ∘ S), Rad_S(H) | Rademacher complexity of H on S | | |
| Rad_n(H) | Rademacher complexity over samples of size n | | |
| GD | gradient descent | | |
| SGD | stochastic gradient descent | | |
| B | a batch set | B | |
| \|B\| | batch size | b | |
| η | learning rate | \eta | |
| k | discretized frequency | \bm{k} | \vk |
| ξ | continuous frequency | \bm{\xi} | \vxi |
| ∗ | convolution operation | * | |
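To make the table concrete, here is a minimal NumPy sketch of the two-layer neural network f_θ(x) = ∑_{j=1}^m a_j σ(w_j · x + b_j) and the empirical risk L_S(θ). The variable names follow the notation table; the choice of ReLU as σ and squared loss as ℓ is illustrative, not part of the proposal.

```python
import numpy as np

def sigma(z):
    """Activation function sigma: R -> R_+ (here, ReLU; an illustrative choice)."""
    return np.maximum(z, 0.0)

def f_theta(x, W, a, b):
    """Two-layer network f_theta(x) = sum_j a_j * sigma(w_j . x + b_j).

    W stacks the input weights w_j, shape (m, d);
    a (output weights) and b (bias terms) have shape (m,).
    """
    return a @ sigma(W @ x + b)

def empirical_risk(X, Y, W, a, b):
    """L_S(theta): average loss over the sample set S = {(x_i, y_i)}_{i=1}^n.

    Squared loss is used here as an illustrative choice of ell.
    """
    preds = np.array([f_theta(x, W, a, b) for x in X])
    return np.mean((preds - Y) ** 2)
```

With m = d = 2, identity input weights, a = (1, 1), and b = 0, the input x = (1, −2) gives σ(Wx + b) = (1, 0) and hence f_θ(x) = 1.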

A short video explaining the notation is also available.