A collection of thoughts, explorations, and research findings in the field of Generative AI and Cognitive Science.
By applying concentration inequalities such as Hoeffding's, we derive high-probability confidence bounds for stochastic processes, providing a foundation for algorithms like UCB and for the analysis of TD learning.
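As a concrete illustration, the sketch below implements UCB1 on a Bernoulli bandit: the exploration bonus sqrt(2 ln t / n) comes directly from inverting Hoeffding's inequality. The reward means and horizon are illustrative choices, not values from the text.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Minimal UCB1: play each arm once, then always pick the arm maximizing
    empirical mean + sqrt(2 ln t / n) -- the Hoeffding-derived confidence bonus."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k       # pulls per arm
    sums = [0.0] * k       # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1    # initialization: one pull per arm
        else:
            arm = max(range(k), key=lambda i:
                      sums[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli reward
        counts[arm] += 1
        sums[arm] += reward
    return counts

counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
# the best arm (index 2) accumulates the large majority of pulls
```

Because the confidence radius shrinks as an arm is pulled more often, suboptimal arms receive only O(log T) pulls, which is the essence of the UCB regret guarantee.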
The mathematical principles of human-computer interaction, spanning 2D set prediction, open-vocabulary generalization, and 3D physical reconstruction. This research explores how modern architectures leverage Transformer and diffusion models to solve complex inverse problems in perception and generation.
A unit quaternion constructed from half-angles, applied through the conjugation operator v' = q v q*, is geometrically equivalent to the Rodrigues rotation formula while enabling stable, efficient, and distance-preserving control of spatial attitude.
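A minimal sketch of this construction: the quaternion is built from cos(θ/2) and sin(θ/2), and rotation is the conjugation v' = q v q*. The axis, angle, and test vector below are illustrative; the check that a 90° rotation about z sends x̂ to ŷ matches what the Rodrigues formula predicts.

```python
import math

def quat_from_axis_angle(axis, theta):
    """Unit quaternion (w, x, y, z) from half-angles: q = (cos θ/2, sin θ/2 · axis)."""
    n = math.sqrt(sum(a * a for a in axis))
    ax = [a / n for a in axis]          # normalize the rotation axis
    s = math.sin(theta / 2.0)
    return (math.cos(theta / 2.0), s * ax[0], s * ax[1], s * ax[2])

def quat_mul(p, q):
    """Hamilton product of two quaternions."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def rotate(q, v):
    """Conjugation v' = q v q*; for a unit quaternion the conjugate is the inverse."""
    qc = (q[0], -q[1], -q[2], -q[3])
    w = quat_mul(quat_mul(q, (0.0, *v)), qc)
    return w[1:]

# 90° about the z-axis sends x-hat to y-hat, as Rodrigues' formula predicts
q = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
vx = rotate(q, (1.0, 0.0, 0.0))   # ≈ (0, 1, 0)
```

Since conjugation by a unit quaternion is an isometry, vector norms are preserved exactly, which is the distance-preserving property noted above.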
Represents a 3D scene as an explicit set of anisotropic 3D Gaussians, each defined by parameters for position, covariance, opacity, and view-dependent color via Spherical Harmonics. Optimizes all parameters from images using stochastic gradient descent, coupled with an adaptive density control mechanism that dynamically clones, splits, and prunes the Gaussians to create a detailed and efficient scene representation.
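The per-Gaussian covariance is kept valid during gradient descent by parameterizing it as Σ = R S Sᵀ Rᵀ, with R a rotation (stored as a unit quaternion) and S a diagonal scaling matrix, so Σ is symmetric positive semi-definite by construction. A minimal sketch of that factorization (the specific scales and quaternion below are illustrative):

```python
import numpy as np

def covariance_from_scaling_rotation(scales, quat):
    """Build an anisotropic 3D covariance Σ = R S Sᵀ Rᵀ from per-axis scales
    and a unit quaternion; PSD by construction, so safe to optimize freely."""
    w, x, y, z = quat / np.linalg.norm(quat)   # renormalize for safety
    R = np.array([                              # quaternion -> rotation matrix
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])
    S = np.diag(scales)
    M = R @ S
    return M @ M.T                              # Σ = R S Sᵀ Rᵀ

# identity rotation -> a diagonal covariance holding the squared per-axis scales
sigma = covariance_from_scaling_rotation(np.array([2.0, 1.0, 0.5]),
                                         np.array([1.0, 0.0, 0.0, 0.0]))
```

Optimizing scales and quaternion instead of the six covariance entries directly is what prevents gradient steps from ever producing an invalid (non-PSD) Gaussian.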
Statistical mechanics is built upon the postulate of equal a priori probabilities, which bridges deterministic microscopic mechanics and macroscopic thermodynamics. This foundation allows for the derivation of entropy and temperature, ultimately unifying classical and quantum descriptions through ensemble theory.
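The derivation of entropy from the equal-probability postulate can be made concrete with Boltzmann's S = k_B ln Ω. The sketch below counts the microstates Ω = C(N, n) of a hypothetical system of N two-level particles with n excited, chosen purely for illustration:

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K (exact, 2019 SI definition)

def boltzmann_entropy(n_microstates):
    """S = k_B ln Ω: entropy when all Ω microstates are equally probable."""
    return K_B * math.log(n_microstates)

# Illustrative system: N two-level particles, n of them excited.
# Under equal a priori probabilities, Ω is the number of ways to pick
# which n particles are excited.
N, n = 100, 50
omega = math.comb(N, n)
S = boltzmann_entropy(omega)
```

Entropy is maximized at n = N/2 because that macrostate has the most microstates, a direct consequence of the equal-probability postulate rather than of any dynamical detail.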
Analyzed why the marginal likelihood in deep latent variable models is intractable and cannot be optimized directly, and introduced the framework of variational inference, in which an encoder approximates the true posterior distribution, recasting the problem as maximization of the Evidence Lower Bound (ELBO). Applied the reparameterization trick to make the sampling step differentiable, enabling gradient-based optimization of the ELBO and end-to-end training of the entire generative model, encoder and decoder alike.
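The two ingredients above can be sketched in a few lines: the reparameterization z = μ + σ ⊙ ε moves the randomness into ε ~ N(0, I) so gradients flow through μ and σ, and the KL term of the ELBO against a standard-normal prior has a closed form. This is a minimal NumPy illustration, not a full VAE; the latent dimension is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """z = μ + σ ⊙ ε with ε ~ N(0, I): sampling becomes a deterministic,
    differentiable function of (μ, σ), so gradients pass through it."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(q(z|x) ‖ N(0, I)), the regularization term of the ELBO:
    -0.5 Σ (1 + log σ² − μ² − σ²)."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

mu = np.zeros(4)
log_var = np.zeros(4)                    # σ = 1, so q already equals the prior
z = reparameterize(mu, log_var)
kl = kl_to_standard_normal(mu, log_var)  # exactly 0 when q matches N(0, I)
```

In a full VAE the negative ELBO adds a reconstruction term from the decoder to this KL term, and both are minimized jointly by stochastic gradient descent.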