The content on this page was provided by an independent third party and syndicated by XPR Media. Members of the editorial and news staff of the USA TODAY Network were not involved in the creation of this content.

AIM Intelligence and BMW Group Examine Gaps in Evaluating Enterprise AI Policy Compliance

Research reveals LLMs follow allowlist policies but systematically fail to enforce organizational prohibitions, exposing a critical gap in enterprise AI safety

SF, CA, UNITED STATES, February 12, 2026 /EINPresswire.com/ — Seoul, South Korea / Munich, Germany – January 2026 – BMW Group and AIM Intelligence, a leading AI safety startup, today announced the publication of COMPASS (Company/Organization Policy Alignment Assessment), the first systematic framework for evaluating whether large language models (LLMs) comply with organization-specific policies. The research, now available on arXiv, reveals a critical gap that remains under-measured in current evaluation practices: models that pass standard safety benchmarks often fail dramatically when enforcing the nuanced, context-dependent rules that govern real-world business operations.

Why Enterprise AI Policies Break Down in Practice

As organizations across healthcare, finance, automotive, and government sectors rapidly adopt LLMs for customer-facing applications, the research team discovered a fundamental asymmetry that poses significant risks for policy-critical deployments.

Key Findings:
Strong Allowlist Compliance: Models reliably handle legitimate requests with over 95% accuracy
Critical Denylist Failures: Models fail to correctly refuse prohibited requests in up to 97% of cases
Catastrophic Adversarial Vulnerability: Under adversarial conditions, some models refuse fewer than 5% of policy-violating requests

“Most AI safety tests focus on whether a model behaves safely in general,” said Dasol Choi, AI Safety Researcher at AIM Intelligence. “COMPASS looks at a more practical question: can an AI system reliably follow the specific rules of an organization? Our findings show that, in many real-world deployments today, the answer is often no.”
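
As a rough illustration of how the two headline metrics differ, the Python sketch below scores a handful of made-up responses. The `Result` record, the `policy_scores` helper, and the sample data are all hypothetical; none of them come from the COMPASS codebase or its reported results.

```python
from dataclasses import dataclass


@dataclass
class Result:
    category: str  # "allow" = permitted request, "deny" = prohibited request
    refused: bool  # True if the model declined to answer


def policy_scores(results):
    """Return (allowlist accuracy, denylist refusal rate) as fractions in [0, 1]."""
    allow = [r for r in results if r.category == "allow"]
    deny = [r for r in results if r.category == "deny"]
    # A permitted request is handled correctly when the model does NOT refuse it.
    allow_acc = sum(not r.refused for r in allow) / len(allow) if allow else float("nan")
    # A prohibited request is handled correctly when the model DOES refuse it.
    deny_rate = sum(r.refused for r in deny) / len(deny) if deny else float("nan")
    return allow_acc, deny_rate


# Toy data for illustration only (not benchmark results).
sample = [
    Result("allow", False), Result("allow", False),
    Result("allow", False), Result("allow", True),
    Result("deny", True), Result("deny", False),
    Result("deny", False), Result("deny", False),
]

allow_acc, deny_rate = policy_scores(sample)
print(f"allowlist accuracy: {allow_acc:.2f}")
print(f"denylist refusal rate: {deny_rate:.2f}")
```

Scoring the two categories separately is what makes the asymmetry visible: a single blended accuracy figure would let strong performance on permitted requests hide weak enforcement of prohibitions.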

Why Generic AI Safety Isn’t Enough

The research addresses a critical disconnect between how AI systems are evaluated and how they are deployed. While existing safety benchmarks focus on universal harms such as toxicity and violence, real enterprises operate under complex internal policies—compliance manuals, operational playbooks, legal edge cases, and brand-specific constraints.

COMPASS evaluates models across four dimensions that typical benchmarks ignore:
1. Policy Selection: Can the model identify which policy applies to a given situation?
2. Policy Interpretation: Can it reason through conditionals, exceptions, and vague clauses?
3. Conflict Resolution: When rules collide, does the model resolve conflicts as the organization intends?
4. Justification: Can the model ground its decisions in actual policy text?

“Our evaluation revealed a striking asymmetry,” noted DongGeon Lee, AI Safety Researcher at AIM Intelligence. “While models achieve near-perfect accuracy on what they can do, they remain structurally vulnerable in enforcing what they must not do. This gap persists across model scales and architectures, indicating that scaling alone cannot solve the problem.”

Industry-Scale Validation

The research team applied COMPASS across eight diverse industry scenarios—Automotive, Government, Financial, Healthcare, Travel, Telecom, Education, and Recruiting—generating and validating 5,920 queries that test both routine compliance and adversarial robustness. Fifteen state-of-the-art models were evaluated, including leading proprietary and open-source systems.

Making Misalignment Measurable

Perhaps the most significant contribution of COMPASS is transforming alignment from a philosophical concern into an engineering problem. The framework and benchmark datasets are publicly available on GitHub and Hugging Face, enabling organizations to evaluate their AI systems against their own policies.

About the Research Collaboration

This research represents a collaboration between AIM Intelligence, BMW Group, Yonsei University, Pohang University of Science and Technology, and Seoul National University. The full paper, “COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs,” is available at https://arxiv.org/abs/2601.01836.

About AIM Intelligence

AIM Intelligence is a Seoul-based AI safety company specializing in automated red-teaming, real-time guardrails, and AI monitoring solutions. Founded in 2024, AIM Intelligence serves major enterprises and conducts research across large language models, multimodal systems, autonomous agents, and emerging physical AI. The company has published over 15 research papers at top-tier venues, including ICML, ACL, NeurIPS, and IEEE.

Team Cookie Official
Team Cookie

Legal Disclaimer:

EIN Presswire provides this news content “as is” without warranty of any kind. We do not accept any responsibility or liability
for the accuracy, content, images, videos, licenses, completeness, legality, or reliability of the information contained in this
article. If you have any complaints or copyright issues related to this article, kindly contact the author above.

