35. Stochastic Chemical Kinetics 2 by MIT OpenCourseWare

Description

Summary by www.lecturesummary.com.


        Introduction/Lecture Wrap-up

        • Final Lecture: This is the speaker's final lecture, concluding the unit on stochastic methods.
        • Review Lecture: Professor Swan will deliver a review lecture on Wednesday.
        • Final Exam: The final exam is a week from the date of the lecture.

        Multi-dimensional Integrals and Metropolis Monte Carlo Challenges

        • Scientific Integrals: Many scientific multi-dimensional integrals have the form of a probability density P times a function F.
        • Unknown Density: In many cases, the probability density P is unknown, but a weighting factor W (proportional to P) is known.
        • High Dimensions: The integrals may be over extremely high-dimensional spaces (e.g., 10^23 dimensions).
        • Examples: Examples are the Boltzmann distribution and Bayesian analysis of experiments.
        • Likelihood Function: In Bayesian analysis, the likelihood of the data given the parameters, times the prior knowledge of the parameters, provides a weighting factor W for integrals over the parameters.
        • Evaluation: This enables the evaluation of multi-dimensional integrals over parameters to compute quantities such as the expectation value of a function of those parameters.
        • Metropolis Monte Carlo: Whenever you see integrals with weighting factors, you should seriously consider using Metropolis Monte Carlo.
        • Bayesian Formula: The Bayesian formula is crucial because it gives the shape of the probability density of the parameters, including correlations and prior knowledge.
        • Step Size Selection: Metropolis Monte Carlo can be tricky, especially in selecting the step size (delta) in each dimension (see the sketch after this list).
        • Step Size Issues: Selecting delta too big means attempting large leaps to unlikely states, which get rejected, so the chain repeats the same state over and over.
        • Small Delta: Selecting delta too small means steps are almost always accepted, but the full range is never sampled.
        • Sampling Coverage: Coverage can be checked by plotting the distance from the starting state; an extremely small distance suggests poor coverage.
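
        A minimal sketch of Metropolis Monte Carlo in one dimension makes the step-size issue concrete. The weighting factor w, the property f, and all parameter values below are illustrative assumptions, not from the lecture; the key features are that the acceptance test uses only the ratio of weighting factors (no normalization needed) and that a rejected step recounts the current state in the average.

```python
import numpy as np

# Minimal sketch of 1-D Metropolis Monte Carlo (illustrative w, f, delta).
def w(x):
    return np.exp(-x**2)        # unnormalized weighting factor, Boltzmann-like

def f(x):
    return x**2                 # property whose expectation <F> we want

rng = np.random.default_rng(0)
x, delta, n_steps = 0.0, 1.0, 100_000
total, accepted = 0.0, 0

for _ in range(n_steps):
    x_new = x + rng.uniform(-delta, delta)   # trial move of size at most delta
    if rng.random() < w(x_new) / w(x):       # accept with prob min(1, W_new/W_old)
        x, accepted = x_new, accepted + 1
    total += f(x)                            # a rejected move recounts the old state

print("<F> ~", total / n_steps, "  acceptance ratio:", accepted / n_steps)
```

        Rerunning with delta = 10 produces mostly rejections and long runs of repeated states, while delta = 0.01 accepts nearly every step but crawls across the range, matching the two failure modes above.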

        Adaptive Step Sizing

        • Adaptive Algorithms: Good algorithms attempt to make the step size (delta) adaptive (a sketch follows this list).
        • State Acceptance: If states are seldom accepted, delta can be reduced; if they are always accepted, it can be expanded.
        • Transition Acceptance: One usually aims to accept most transitions (e.g., with 0.8 to 0.9 probability), but not all of them, for efficiency.
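
        A minimal sketch of one way to adapt delta, assuming the 0.2 to 0.8 acceptance window quoted later in the lecture as a rule of thumb; the adjustment factor of 1.1 is an illustrative choice, not from the lecture.

```python
# Hedged sketch: nudge delta to keep the acceptance ratio in a target window.
def adapt_delta(delta, acceptance_ratio, low=0.2, high=0.8, factor=1.1):
    if acceptance_ratio < low:      # rejecting too often: take smaller steps
        return delta / factor
    if acceptance_ratio > high:     # accepting nearly everything: stretch out
        return delta * factor
    return delta
```

        Such adaptation is usually applied during an initial equilibration phase; changing delta during the production averaging can, in principle, bias the sampled distribution.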

        Initial Conditions and Accuracy

        • Avoiding Sensitivity: Avoiding sensitivity to the initial guess entails taking sufficient steps to sample everywhere.
        • Initial Conditions Check: It is a good habit to begin from various initial conditions and check that the value of the integral remains the same.
        • Sampling Problem: If the answer is sensitive to the initial condition, there is a sampling problem.
        • Demonstration: A demonstration using the dihedral angle of hydrogen peroxide shows that sampling only one of the two symmetric lobes still yields a decent mean, but the same mistake would give the wrong answer if the distribution were asymmetric.
        • Physically Accessible Region: It is important to make sure you are sampling over all of the physically accessible region.
        • Critical Understanding: It is absolutely critical to have a rough idea of the answer (units, order of magnitude) before starting a calculation, so the result can be checked for bugs and reasonableness.
        • Monte Carlo Application: This concept is applied to Monte Carlo: getting a result far from the expected value (e.g., dipole moment) indicates a mistake, possibly due to incorrect sampling (e.g., wrong delta).
        • High Accuracy: It takes an extremely large number of states to achieve high accuracy in Monte Carlo.
        • Accuracy Proportionality: The statistical error scales as 1 over the square root of N (the number of samples).
        • Significant Figures: Although relatively few samples give decent accuracy at first, each additional significant figure takes proportionately enormous effort (100 times as many samples per extra figure), as the sketch after this list illustrates.
        • Speed Comparison: Monte Carlo can be significantly faster than procedures such as the trapezoid rule in multi-dimensions for quickly getting a few significant figures.
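
        A quick numerical check of this scaling, using an illustrative toy estimate of <x^2> = 1 under a standard normal distribution (the example is an assumption, not from the lecture): each 100-fold increase in N should shrink the error roughly 10-fold, i.e., buy one more significant figure.

```python
import numpy as np

# Plain Monte Carlo estimate of <x^2> = 1 for a standard normal; the error
# should fall roughly as 1/sqrt(N).
rng = np.random.default_rng(1)
for n in (100, 10_000, 1_000_000):
    samples = rng.standard_normal(n)
    error = abs((samples**2).mean() - 1.0)
    print(f"N = {n:>9,d}   |error| ~ {error:.4f}")
```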

        Problem Simplification (Dimensionality Reduction)

        • Dimensionality Relief: Dimensionality reduction can be a big relief, exemplified by simplifying an assigned problem from 12 coordinates to 6.
        • Anticipated Answer: Keeping the anticipated answer in mind can guide simplifications, such as eliminating translational or rotational degrees of freedom if they do not affect the quantity of interest.

          Step Size Effects and Initial Estimates

          • 9:30: Altering the maximum step size influences the percentage of accepted Metropolis Monte Carlo steps.
          • 9:30: The average value may remain approximately the same despite varying step sizes, perhaps because of symmetry in the potential or in the property being averaged.
          • 9:30: Even with poor initial sampling, the early Metropolis samples already give a value of the right order of magnitude for the anticipated average.

          Acceptance Ratio and Sampling Problems

          • 10:20: A common rule of thumb is to aim for an acceptance ratio between 0.2 and 0.8.
          • 10:20: Sampling distant states or different "lobes" in a distribution might require quite long steps.
          • 10:20: Pre-knowledge about the expected shape of the distribution is helpful for choosing parameters.

          Kinetic Monte Carlo (for Discrete States)

          • 11:21: Kinetic Monte Carlo (KMC) is used for cases where the probability distribution is not stationary and no weighting factor W is explicitly given.
          • 11:21: This is appropriate for kinetics equations.
          • 11:21: Even though the differential equation appears trivial, the number of states can be vast (e.g., 4^100 for a 10x10 catalytic surface with 4 states per site).
          • 11:21: For the example of a catalytic surface, the number of states is on the order of 10^60.
          • 11:21: The matrix M would contain (10^60)^2 = 10^120 elements, which cannot be stored.
          • 11:21: Because the number of states is so large, it is not possible to sample them all, and none of them might ever be visited.
          • 11:21: Thus, kinetic Monte Carlo, i.e., the Gillespie algorithm, is generally applied to these types of problems.

          Gillespie Algorithm Details

          • 11:21: The Gillespie algorithm is applied for KMC simulations.
          • 11:21: To compute a Gillespie trajectory, two random numbers are needed per step: one for the time until the next event and one for which event happens (see the sketch after this list).
          • 14:47: The cost of KMC involves computing random numbers.
          • 14:47: The number of random numbers depends on the simulation time and the average time between events (delta t).
          • 14:47: When delta t is small and the time for the simulation is large, a single trajectory is costly.
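
          A minimal sketch of a Gillespie trajectory for a toy A -> B -> C network (the species and rate constants are illustrative assumptions, not from the lecture). Each step consumes the two random numbers described above: one sets the exponentially distributed waiting time, the other picks which event fires in proportion to its rate.

```python
import numpy as np

# Hedged sketch of one Gillespie (KMC) trajectory for A -> B -> C.
rng = np.random.default_rng(2)
counts = {"A": 100, "B": 0, "C": 0}
k1, k2 = 1.0, 0.5                       # illustrative rate constants
t, t_end = 0.0, 10.0

while t < t_end:
    a = np.array([k1 * counts["A"], k2 * counts["B"]])  # event propensities
    a_tot = a.sum()
    if a_tot == 0:
        break                           # nothing left that can react
    t += -np.log(rng.random()) / a_tot  # random number 1: waiting time
    if rng.random() < a[0] / a_tot:     # random number 2: which event fires
        counts["A"] -= 1; counts["B"] += 1
    else:
        counts["B"] -= 1; counts["C"] += 1

print(t, counts)
```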

          Time in KMC

          • 13:42: Calculation of time in the Gillespie algorithm is required if you are interested in the time evolution of the system.
          • 13:42: If you only want the steady-state solution, it may be possible to simulate without calculating the time between events explicitly.

          KMC Cost and Sampling Limitations

          • 14:47: Many low-probability states will never be sampled.
          • 14:47: Some high-probability states will also be missed if the total number of samples is insufficient.
          • 14:47: The error again scales as 1 over the square root of N, so highly accurate results are hard to obtain.

          Data Storage and Analysis in KMC

          • 15:59: Lots of trajectories have to be run in KMC.
          • 15:59: It's usually not possible to store all trajectories due to the volume of data.
          • 15:59: It is best to compute averages and other desired quantities on the fly as the simulation runs (a sketch follows this list).
          • 15:59: This requires deciding what diagnostics to compute beforehand and coding them into the simulation.
          • 15:59: The running averages only have to be saved, not details on each state sequence.
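
          A minimal sketch of such an on-the-fly average; it assumes, as is natural for a KMC trajectory, that each state should be weighted by the time dt the system spends in it. Only two running totals are stored, never the state sequence itself.

```python
# Hedged sketch of a streaming, time-weighted running average for KMC.
class RunningAverage:
    def __init__(self):
        self.weighted_sum = 0.0   # sum of value * (time spent in that state)
        self.total_time = 0.0

    def update(self, value, dt):
        self.weighted_sum += value * dt
        self.total_time += dt

    @property
    def mean(self):
        return self.weighted_sum / self.total_time
```

          One such accumulator per diagnostic is created before the run, updated after every event, and saved at the end; the trajectory itself is discarded.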

          Initial Conditions (Poisson Distribution)

          • 16:49: KMC similarly has an initial condition issue.
          • 16:49: Macroscopically, you may know only average values (e.g., the average number of molecules on the surface).
          • 16:49: You do not know the exact discrete state or the site-to-site correlations.
          • 16:49: A Poisson distribution is usually applied to calculate the probability of having a certain number of each species.

            KMC and MD Overview

            • 16:49: The Poisson distribution, P(n) = <n>^n e^(-<n>) / n!, depends only on the average number <n> of each species.
            • 17:53: When beginning a new KMC trajectory, you should sample from the Poisson distribution to obtain a fresh initial condition for the counts of the various species (see the sketch after this list).
            • 17:53: This samples over both the initial conditions and the events that follow.
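
            A minimal sketch of drawing a fresh initial condition from Poisson distributions whose means are the known macroscopic averages; the species names and average values are illustrative assumptions, not from the lecture.

```python
import numpy as np

# Hedged sketch: resample integer initial counts from Poisson distributions,
# P(n) = <n>^n exp(-<n>) / n!, for each new KMC trajectory.
rng = np.random.default_rng(3)
average_counts = {"adsorbed_CO": 40.0, "adsorbed_O": 25.0}   # illustrative

initial_condition = {species: rng.poisson(lam)
                     for species, lam in average_counts.items()}
print(initial_condition)    # integer counts, redrawn for every trajectory
```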

            KMC Optimization - Handling Time Scales

            • 18:14: Because KMC problems with large numbers of states are challenging, there need to be ways to speed them up.
            • 18:14: One way is to recognize and examine the fast processes, such as diffusion.
            • 18:14: If fast processes (e.g., diffusion) are not essential to the desired product (e.g., reaction), they can be treated differently.
            • 18:14: Alternatives include assuming infinitely rapid diffusion or artificially slowing diffusion down to match the slower processes (see the sketch after this list).
            • 18:14: Slowing down rapid processes artificially can significantly speed up calculations.
            • 21:55: Including rapid diffusion with its actual timescale may result in costly trajectories and restricted sampling.
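
            A minimal sketch of slowing a fast process artificially: cap its rate constant at a multiple of the slowest rate of interest. The cap factor of 100 is an illustrative assumption, and one would check that the answers are insensitive to it.

```python
# Hedged sketch: artificially cap a fast rate (e.g., diffusion) so it stays
# within a chosen factor of the slowest process of interest.
def capped_rate(k_fast, k_slowest_of_interest, factor=100.0):
    return min(k_fast, factor * k_slowest_of_interest)
```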

            KMC Optimization - Handling Low Probability Events and Sampling Error

            • 20:17: Extremely low probability events may never be sampled.
            • 20:17: If a process occurs on a much slower timescale than the simulation can achieve, it may be excluded from the calculation.
            • 20:17: Eliminating sluggish processes decreases the number of states.
            • 20:17: Sufficient sampling requires knowledge of the margin of error, which is greater for rare events.
            • 21:55: A fast reaction will have adequate sampling statistics, but a slow reaction may not yield reliable conclusions.

            Molecular Dynamics

            • 22:28: Molecular Dynamics (MD) solves equations of motion for atoms or clusters, usually applying Newton's equations.
            • 22:28: Force fields, fitted to experimental and/or quantum chemistry results, describe the interatomic interactions.
            • 22:28: MD is usually classical, but quantum mechanical effects can be incorporated.

            MD Algorithms and Techniques

            • 23:43: The velocity Verlet algorithm is popular because it conserves energy over many steps (a sketch follows this list).
            • 23:43: Thermostats simulate contact with a thermal bath and adjust velocities periodically.
            • 23:43: Thermostats are useful in chemical kinetics simulations for observing infrequent, high-energy events.
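
            A minimal sketch of one velocity Verlet step for a 1-D harmonic oscillator (the force law and parameters are illustrative assumptions, not from the lecture). The velocity update averages the forces at the old and new positions, which is what gives the method its good long-time energy conservation.

```python
# Hedged sketch of velocity Verlet for a 1-D harmonic oscillator.
def force(x, k=1.0):
    return -k * x                               # harmonic restoring force

def velocity_verlet_step(x, v, dt, m=1.0):
    a = force(x) / m
    x_new = x + v * dt + 0.5 * a * dt**2        # position update
    a_new = force(x_new) / m
    v_new = v + 0.5 * (a + a_new) * dt          # uses both old and new forces
    return x_new, v_new

x, v, dt = 1.0, 0.0, 0.01
for _ in range(1000):
    x, v = velocity_verlet_step(x, v, dt)
print(x, v, 0.5 * v**2 + 0.5 * x**2)            # energy stays near 0.5
```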

            MD vs. Metropolis Monte Carlo

            • 25:00: MD may serve as an alternative to Metropolis Monte Carlo for computing multi-dimensional integrals.
            • 25:00: MD's moves are set by the actual physical motion, so there is no Metropolis-style step size parameter to tune.
            • 25:00: MD is time-accurate, requiring extremely small delta t.
            • 25:00: If the objective is a steady-state property, Metropolis Monte Carlo may be preferable.

            MD Use Cases and Limitations (Time Scales)

            • 26:50: MD is well-suited for calculating time-dependent properties on picosecond to nanosecond timescales.
            • 26:50: Examples include energy transfer mechanisms on picosecond timescales.
            • 26:50: A very small delta t restricts overall simulation time, often to nanoseconds.
            • 26:50: MD is not feasible for processes that relax to equilibrium on longer timescales.
            • 26:50: MD excels at sampling dynamics within a given conformation at nanosecond timescales.

            MD Initial Conditions

            • 28:30: MD has an initial condition issue with placing atoms.
            • 28:30: Sampling over varying molecular arrangements is worthwhile.
            • 28:30: MD cannot track slowly evolving conformational changes alone.
            • 28:30: A separate sampling technique may be needed to initialize the system.

            Comparing Methods (MC, KMC, MD)

            • 29:16: Metropolis Monte Carlo, Kinetic Monte Carlo, and Molecular Dynamics are tools for different problems.
              • 29:16: No single method is best suited for every problem.

              Selecting a Tool

              • 30:00: It is important to select the tool that suits the particular problem you are trying to solve.

              Q&A

              • 30:20: The rest of the class time is designated for questions.

              Choosing P and F in Monte Carlo, Marginal Integrals

              • 30:36: A question is raised about how to choose the probability distribution P or weighting factor W when evaluating an integral of G(x).
              • 30:36: The approach is to rewrite the integrand G(x) as P(x) * F(x).
              • 30:36: Using a uniform distribution for P is possible but inefficient, since it wastes samples on low-probability regions.
              • 30:36: The goal is to find the P*F factorization that works best for sampling.
              • 30:36: "Working best" means making F as constant as feasible and P as sharply peaked as feasible.
              • 30:36: Balancing a sharp P against a flat F is a trade-off.
              • 30:36: Default options include using the Boltzmann factor in statistical mechanics problems or the provided W in Bayesian analysis.
              • 30:36: You typically have a joint probability density or weighting factor W(theta1, theta2, ...), but only need a subset of the parameters.
              • 30:36: You can perform a "marginal integral" by summing out the degrees of freedom you don't care about (see the sketch after this list).
              • 30:36: Marginalizing out degrees of freedom is a convenient trick applicable to both Bayesian and Boltzmann problems.
              • 30:36: What you do specifically depends on which system properties you are interested in.
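
              A minimal sketch of a marginal expectation computed by Metropolis sampling; the correlated two-parameter weighting factor W below is an illustrative assumption, not from the lecture. Once samples of (theta1, theta2) are drawn from the joint weight, averaging any function of theta1 alone sums theta2 out automatically.

```python
import numpy as np

# Hedged sketch: Metropolis over (theta1, theta2) with an illustrative
# correlated joint weight; <theta1^2> is a marginal average, since theta2
# is simply ignored in the accumulated property.
def w(theta):
    t1, t2 = theta
    return np.exp(-(t1**2 + t2**2 + t1 * t2))

rng = np.random.default_rng(4)
theta, delta, n_steps = np.zeros(2), 1.0, 200_000
sum_f = 0.0

for _ in range(n_steps):
    trial = theta + rng.uniform(-delta, delta, size=2)
    if rng.random() < w(trial) / w(theta):
        theta = trial
    sum_f += theta[0]**2          # f depends on theta1 only

print("<theta1^2> ~", sum_f / n_steps)
```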

              Course Logistics

              • 34:00: Final review session schedule discussion (Wednesday evening or Friday morning vote).
              • 34:00: There are times scheduled today for review sessions, one of which is on Kinetic Monte Carlo.
              • 34:00: The solution to the previously graded homework will be made available soon.

              Closing Remarks

              • 34:45: Best of luck with studying and the exam.