What is the definition, algorithm, advantages and disadvantages of online learning, and what are the differences between it and batch learning?

Online learning, in the context of machine learning, is a methodology where a model is updated incrementally, one data point (or one small mini-batch) at a time, as each new observation arrives. This stands in stark contrast to batch learning, where the model is trained once on the entire available dataset. The core algorithm for online learning typically involves an initial model, often with random weights, which is then updated for each new training example using a rule like stochastic gradient descent (SGD). For a new data point, the algorithm makes a prediction, computes the loss or error, and then immediately adjusts the model parameters in the opposite direction of the gradient of that loss. This sequential, instance-by-instance update is the defining computational mechanism, enabling the model to evolve continuously without revisiting past data.
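The predict-compute-loss-update loop described above can be sketched in a few lines. This is a minimal illustration using NumPy and squared loss on a linear model; the learning rate, stream length, and synthetic data-generating rule are illustrative choices, not part of any canonical algorithm.

```python
import numpy as np

def online_sgd_step(w, x, y, lr=0.01):
    """One online update for linear regression with squared loss:
    predict, compute the error, then move the weights against the
    gradient of the loss on this single example."""
    y_hat = w @ x                 # prediction for the new point
    grad = (y_hat - y) * x        # gradient of 0.5*(y_hat - y)^2 w.r.t. w
    return w - lr * grad          # immediate parameter update

# Simulate a stream whose true relationship is y = 2*x1 - 1*x2.
rng = np.random.default_rng(0)
w = np.zeros(2)                   # initial model (here: zero weights)
for _ in range(5000):
    x = rng.normal(size=2)
    y = 2.0 * x[0] - 1.0 * x[1]
    w = online_sgd_step(w, x, y)  # the example is then discarded

print(w)                          # drifts toward the true weights [2, -1]
```

Note that each example is touched exactly once and never stored, which is the property that makes the memory footprint constant regardless of stream length.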

The primary advantages of online learning are its efficiency with massive or streaming data and its adaptability to non-stationary environments. Because it processes data points individually, it has a constant, low memory footprint, making it feasible for applications where the dataset is too large to fit in memory or is conceptually infinite, such as real-time sensor feeds or user clickstream data. Its incremental nature allows it to track changes in the underlying data distribution over time, a critical feature for domains like financial markets or trending topics where patterns drift. The disadvantages, however, are significant. The model's path to convergence can be noisy and less stable than batch methods, since each update is based on a single, potentially unrepresentative, example. It is also highly sensitive to the order of data presentation; a stream of misleading examples can catastrophically degrade performance. Furthermore, once a data point is processed it is typically discarded, so later updates can overwrite what was learned earlier and the model cannot replay old data to recover that knowledge if the concept shifts back, a failure mode known as catastrophic forgetting.
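The adaptability-versus-stability trade-off above can be made concrete with a toy non-stationary stream. The sketch below tracks a drifting signal with an exponentially weighted online estimate; the smoothing rate `alpha`, the shift point, and the noise level are all illustrative assumptions.

```python
import numpy as np

def ewma_update(estimate, x, alpha=0.1):
    """One online update of an exponentially weighted mean.
    Higher alpha adapts to drift faster but yields a noisier estimate."""
    return estimate + alpha * (x - estimate)

rng = np.random.default_rng(1)
estimate = 0.0
# The distribution shifts midway: mean 0.0 for the first 1000 points,
# mean 5.0 afterwards (a simple non-stationary stream).
for t in range(2000):
    true_mean = 0.0 if t < 1000 else 5.0
    x = true_mean + rng.normal(scale=0.5)
    estimate = ewma_update(estimate, x)

print(estimate)   # tracks the post-shift mean, near 5.0
```

A batch estimate (the plain mean of all 2000 points) would land near 2.5, blending the two regimes; the online estimate forgets the old regime and follows the drift, which is exactly the behavior, useful or dangerous, that the paragraph describes.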

The differences between online and batch learning are foundational. Batch learning assumes the availability of a complete, static training set, performs computationally intensive passes over this entire set (epochs), and produces a single, fixed model. This allows for precise optimization, robust convergence analysis, and techniques like exact validation. However, it is computationally expensive to retrain from scratch when new data arrives and is ill-suited for streaming contexts. Online learning, by design, has no concept of a fixed dataset or multiple epochs over static data. The model is in a perpetual state of training, which trades the optimality and stability of batch processing for immediate updates and constant operational readiness. This makes the choice between them a direct function of the problem constraints: batch learning is preferred for well-curated, finite datasets where accuracy and stability are paramount, while online learning is a necessity for live systems that must learn from an unbounded, evolving data stream in real time.
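The contrast between the two regimes can be seen by fitting the same linear model both ways on the same data. Below, the batch fit is an exact least-squares solve over the full dataset, while the online fit makes a single pass, seeing each example once and then dropping it; the learning rate and synthetic data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 2))
y = X @ np.array([3.0, -2.0]) + rng.normal(scale=0.1, size=500)

# Batch: one exact least-squares solve over the complete, static dataset.
w_batch, *_ = np.linalg.lstsq(X, y, rcond=None)

# Online: a single pass, one example at a time, each seen once then dropped.
w_online = np.zeros(2)
for x_i, y_i in zip(X, y):
    w_online -= 0.05 * (w_online @ x_i - y_i) * x_i

print(w_batch)    # exact solution, close to [3, -2]
print(w_online)   # noisier single-pass estimate of the same weights
```

Both land near the true weights here, but the batch solution is the optimum for this dataset, while the online solution is a noisy approximation that could keep absorbing new points indefinitely without ever revisiting these 500.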