OpenBMB recently released MiniCPM3-4B, the third-generation model in the MiniCPM series. The model marks a significant step forward in the capabilities of smaller-scale language models. Designed to ...
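For quick orientation, here is a minimal sketch of loading the model with Hugging Face transformers. The repository id `openbmb/MiniCPM3-4B`, the `trust_remote_code` requirement, and the chat-template usage are assumptions based on common practice for OpenBMB releases, not details from the announcement:

```python
# Hypothetical quick-start for MiniCPM3-4B via Hugging Face transformers.
# The repo id and trust_remote_code flag are assumptions, not confirmed above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/MiniCPM3-4B"  # assumed Hugging Face repository id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps a 4B model on one GPU
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Summarize the MiniCPM series in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```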
Large language models (LLMs) have achieved remarkable success in natural language processing (NLP). These large-scale deep learning models, especially transformer-based architectures, have grown exponentially ...
In deep learning, neural network optimization has long been a crucial area of focus. Training large models like transformers and convolutional networks requires significant computational resources and ...
Machine learning (ML) models are increasingly used in weather forecasting, offering accurate predictions at lower computational cost than traditional numerical weather prediction (NWP) models. However, ...
Artificial Intelligence (AI) and Machine Learning (ML) have been transformative in numerous fields, but one significant challenge remains: the reproducibility of experiments. Researchers frequently ...
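To make the reproducibility point concrete, here is a minimal sketch of the seed-pinning step most ML experiments start with. The `set_seed` helper and the choice of libraries (NumPy, PyTorch) are illustrative, not drawn from the article:

```python
# A minimal reproducibility sketch: pin the random seeds a typical ML run
# depends on. Helper name and library choices are illustrative assumptions.
import os
import random

import numpy as np
import torch

def set_seed(seed: int = 42) -> None:
    """Fix the seeds of the Python, NumPy, and PyTorch RNGs for one run."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # Trade speed for determinism in cuDNN convolution kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```

Seed pinning alone does not guarantee bit-identical results across hardware or library versions, but it removes the most common source of run-to-run variance.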
Prior research on Large Language Models (LLMs) has demonstrated significant advancements in fluency and accuracy across various tasks, influencing sectors such as healthcare and education. This progress ...
A significant challenge in information retrieval today is determining the most efficient method for nearest-neighbor vector search, especially with the growing complexity of dense and sparse retrieval ...
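For reference, the operation in question reduces to the brute-force baseline below; production systems replace this exact O(N) scan with approximate indexes such as HNSW or IVF. The embedding dimension and corpus size are illustrative:

```python
# Brute-force nearest-neighbor vector search by cosine similarity.
# Shapes and data are illustrative; real systems use approximate indexes.
import numpy as np

def nearest_neighbors(query: np.ndarray, corpus: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k corpus vectors most cosine-similar to query."""
    corpus_norm = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    scores = corpus_norm @ query_norm  # one dot product per corpus vector
    return np.argsort(-scores)[:k]     # highest similarity first

corpus = np.random.rand(10_000, 768).astype(np.float32)  # e.g. dense text embeddings
query = np.random.rand(768).astype(np.float32)
print(nearest_neighbors(query, corpus, k=3))
```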
Predicting battery lifespan is difficult due to the nonlinear nature of capacity degradation and the uncertainty of operating conditions. Because such prediction is vital for the reliability ...
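As a toy illustration of that nonlinearity, the sketch below fits a simple exponential capacity-fade model to synthetic cycling data with SciPy. The model form, the 80% end-of-life threshold, and the data are assumptions for illustration only, not the article's method:

```python
# Illustrative fit of an exponential capacity-fade model to synthetic data.
# Model form, threshold, and data are assumptions for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def capacity_model(cycle, c0, k):
    """Exponential fade: capacity = c0 * exp(-k * cycle)."""
    return c0 * np.exp(-k * cycle)

cycles = np.arange(0, 500, 10)
capacity = 1.0 * np.exp(-0.001 * cycles) + np.random.normal(0, 0.005, cycles.size)

(c0, k), _ = curve_fit(capacity_model, cycles, capacity, p0=(1.0, 0.001))
# Estimate lifespan as the cycle count where capacity crosses 80% of nominal.
end_of_life = np.log(c0 / 0.8) / k
print(f"fitted c0={c0:.3f}, k={k:.5f}, predicted end of life ~ {end_of_life:.0f} cycles")
```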
Language model research has rapidly advanced, focusing on improving how models understand and process language, particularly in specialized fields like finance. Large Language Models (LLMs) have moved ...
A major challenge in the current deployment of Large Language Models (LLMs) is their inability to efficiently manage tasks that require both generation and retrieval of information. While LLMs excel ...
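A schematic retrieve-then-generate loop shows the coupling being described. Both `embed` and `generate` are hypothetical stand-ins for real models, not an API from the article:

```python
# Schematic retrieve-then-generate loop. `embed` and `generate` are
# hypothetical placeholders for an embedding model and an LLM call.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical placeholder for an embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(64)

def generate(prompt: str) -> str:
    """Hypothetical placeholder for an LLM call."""
    return f"(answer conditioned on: {prompt[:60]}...)"

documents = ["Doc about LLM retrieval.", "Doc about battery chemistry."]
doc_vectors = np.stack([embed(d) for d in documents])

def answer(question: str) -> str:
    # Retrieval step: find the document closest to the question embedding.
    q = embed(question)
    best = int(np.argmax(doc_vectors @ q))
    # Generation step: condition the LLM on the retrieved evidence.
    return generate(f"Context: {documents[best]}\nQuestion: {question}")

print(answer("How do LLMs retrieve information?"))
```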
AI assistants are often rigid: pre-programmed for specific tasks and lacking the flexibility to go beyond them. The limited utility of these systems stems from their inability to learn and adapt ...
Generative Large Language Models (LLMs) are capable of in-context learning (ICL): learning a task from examples provided within the prompt. However, research on the precise principles ...
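A minimal illustration of ICL: the model's weights never change, and the "learning" comes entirely from labeled demonstrations placed in the prompt. The sentiment task and examples below are invented for illustration:

```python
# Build a few-shot prompt: the demonstrations, not any weight update,
# carry the task specification. Task and examples are invented.
few_shot_examples = [
    ("The movie was wonderful.", "positive"),
    ("I want my money back.", "negative"),
]

def build_icl_prompt(query: str) -> str:
    """Concatenate labeled demonstrations, then the unlabeled query."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in few_shot_examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

print(build_icl_prompt("The plot dragged, but the acting was superb."))
# The completion the LLM produces for the final "Sentiment:" slot is its
# in-context prediction.
```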