Measuring Emotional Intelligence and why it might matter for human-machine interfaces; What I read last week
I was an educator in my past life. The methodology (the science, pedagogy, and philosophy) behind designing and executing assessments of humans, and now evaluations of machines (without anthropomorphizing!), has long been a topic of intellectual interest and professional work for me. Most educators agree that we’ve made many mistakes in how we measure human intelligence. Through my blog posts, I’m following my curiosity to understand the intersections of my worlds as an education entrepreneur and now as someone focused on AI evaluations and governance.
1. Measuring Emotional Intelligence (EI)
Details below are from an article titled “Human Abilities: Emotional Intelligence,” authored by John D. Mayer, Richard D. Roberts, and Sigal G. Barsade, which was published in the Annual Review of Psychology in 2008.
What is EI?
EI is an Ability: EI is defined as the ability to carry out accurate reasoning about emotions and the capacity to use emotions and emotional knowledge to enhance thought.
Hierarchical Structure: EI is viewed as fitting within a hierarchy of mental abilities, similar to other forms of intelligence (like verbal or perceptual-organizational intelligence).
Approaches to measuring EI
Scientific literature on EI uses three primary theoretical approaches: the Specific-Ability approach, the Integrative-Model approach, and the Mixed-Model approach.
Specific-Ability approaches focus on individual mental capacities that are important to EI. For example, emotional perception and identification is measured by the Diagnostic Analysis of Nonverbal Accuracy Scales (DANVA) and the Japanese and Caucasian Brief Affect Recognition Test (JACBART).
Integrative-Model approaches regard EI as a cohesive, global ability. For example, Izard’s Emotional Knowledge approach uses the Emotional Knowledge Test (EKT), which integrates multiple specific abilities. The test might ask participants to match an emotion (like sadness) with a relevant situation (such as “your best friend moves away”) and to identify emotions in faces, focusing on both emotional perception and understanding.
Mixed-Model approaches utilize very broad definitions of EI, which include diverse psychological traits, abilities, styles, and other characteristics. Mixed-Model scales are typically operationalized using self-judgment scales (self-report inventories), such as the Emotional Quotient Inventory (EQ-i).
Validity concerns of different approaches to measuring EI
Lack of convergence for the “Specific-Ability approach”: A serious concern regarding the validity of Specific-Ability measures is that different scales designed to measure the same concept, e.g., accurate emotional perception, often do not correlate highly with each other. Correlations among measures of perceiving interpersonal affect are also low. The lack of convergence among measures in the emotional perception domain is not well understood.
Self-judgment challenge for the “Mixed-Model approach”: Mixed Models rely on self-judgment scales (self-report inventories). This is conceptually invalid for assessing a mental ability, because a self-report measures a person’s self-estimated capacity rather than their actual ability to solve emotional problems.
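To make the convergence concern concrete, here is a minimal sketch of how one might check whether two scales that claim to measure the same ability actually agree. It computes a plain Pearson correlation over the same test-takers. The scores, the function names, and the 0.5 cutoff are all my own illustrative assumptions; none of them come from the Mayer, Roberts, and Barsade article.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a scale has no variance")
    return cov / sqrt(var_x * var_y)

def converges(scale_a, scale_b, threshold=0.5):
    """Hypothetical rule of thumb: call r >= threshold acceptable convergence."""
    return pearson_r(scale_a, scale_b) >= threshold

# Made-up scores for the same six test-takers on two hypothetical
# emotion-perception tests (not real DANVA or JACBART data).
test_a = [12, 15, 9, 20, 17, 11]
test_b = [14, 10, 13, 15, 9, 16]

r = pearson_r(test_a, test_b)
print(f"r = {r:.2f}, converges: {converges(test_a, test_b)}")
```

A low or negative r between two tests of the "same" ability is the pattern the authors flag as a validity problem. A real analysis would use far more participants and report confidence intervals.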
Overlap with other traits
Overlap with personality: Mixed-Model measures show substantial empirical overlap with some established personality traits; in particular, higher Mixed-Model scores correlate strongly with lower Neuroticism. Ability measures show only low to moderate correlations with personality (highest with Agreeableness and Openness).
Relationship to Cognitive Intelligence: Ability-based EI measures index emotional knowledge and show moderate correlations with verbal/crystallized intelligence, confirming that EI is related to but distinct from verbal-comprehension and other forms of intelligence.
Potential implications for human-computer interaction (HCI)? Disclaimer: I am not an HCI expert :)
In human EI, the goal is to understand, regulate, and use emotions effectively in social contexts. The analog for HCI is whether a machine can perceive a user’s emotion and then respond or act in a way that fits it.
Measurement Approaches if Applied to HCI
Specific-Ability Approach, i.e., task-specific evaluations:
How accurately can a system detect user frustration or any other relevant emotion in speech and text?
Can it recognize facial expressions, gestures, typing patterns, typos, etc.?
This would be relevant in usability testing, adaptive UIs, etc.
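A task-specific evaluation like this could be scored per emotion rather than with one overall accuracy number, since a detector can look fine on average and still miss most frustrated users. Below is a minimal sketch under my own assumptions: the labels are made up, the detector is hypothetical, and `per_emotion_scores` is a name I invented for illustration.

```python
from collections import Counter

def per_emotion_scores(gold, predicted):
    """Return {emotion: (precision, recall)} from two parallel label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    gold_count = Counter(gold)        # how often annotators used each label
    pred_count = Counter(predicted)   # how often the detector used each label
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)
    scores = {}
    for emotion in sorted(set(gold) | set(predicted)):
        precision = true_pos[emotion] / pred_count[emotion] if pred_count[emotion] else 0.0
        recall = true_pos[emotion] / gold_count[emotion] if gold_count[emotion] else 0.0
        scores[emotion] = (precision, recall)
    return scores

# Made-up data: human annotations vs. a hypothetical detector's output.
gold = ["frustrated", "neutral", "frustrated", "excited", "neutral", "frustrated"]
predicted = ["frustrated", "neutral", "neutral", "excited", "neutral", "excited"]

for emotion, (p, r) in per_emotion_scores(gold, predicted).items():
    print(f"{emotion:<11} precision={p:.2f} recall={r:.2f}")
```

In this toy data the detector is right whenever it says "frustrated" but catches only one of the three frustrated users, which is exactly the kind of gap an overall accuracy figure would hide.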
Integrative-Model Approach, i.e., multi-modal interaction evaluations:
Does the system integrate multiple cues (speech tone and language, facial expression, task behavior) and correspondingly adjust its feedback or empathy responses?
This would be valuable for adaptive tutoring systems, healthcare applications, or assistive agents.
Mixed-Model Approach, i.e., broad interactional evaluations:
Can the system ask the user questions that prompt them to report their own emotions (frustration, excitement, etc.)?
This would be valuable for customer service bots, caregiving robots, or digital companions.
Risks, reliability, and other considerations for human-machine interfaces
Evaluations for high-risk AI applications: Deep research will need to be conducted to understand not only whether machine interfaces understand emotions but also whether the response is appropriate and not provocative.
Trust: If the system responds in a manner that provokes specific emotions in the user, there must be clear and transparent communication about the implications of using such AI applications, e.g., in healthcare or education.
Persuasive AI: High EI-like interactions, such as those in healthcare, digital companions, or education, can manipulate emotions (nudging, persuasion, or dependency), and clear standards are needed to avoid exploitation.
Cultural Sensitivity: Systems must adapt to cultural variations in expressing and interpreting emotion to avoid misalignment.
2. What I read last week
What Lessons Can We Learn from the Internet for AI/ML Evolution?
What I found interesting: Failure is sometimes expected, but the system is designed to recover gracefully. “The Internet was designed to survive failures in terms of packet losses, congestion, or outages. AI systems, on the other hand, often fail unpredictably, due to hallucinations, brittle reasoning, or collapse under adversarial inputs. We need an AI equivalent of TCP’s retransmissions and congestion control, with backup agents, graceful degradation, and self-healing protocols that ensure reliability at scale.”
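One way I read the TCP analogy in that quote is as a wrapper that retries a failing agent with backoff, falls back to a backup agent, and finally degrades to a safe message instead of failing unpredictably. The sketch below is only my interpretation. Every name in it (`call_with_recovery`, `flaky_agent`, `simple_agent`, `is_valid`) is hypothetical, and `is_valid` stands in for whatever check would catch a bad answer, which is itself a hard problem.

```python
import time

def call_with_recovery(primary, backup, prompt, is_valid, retries=2, base_delay=0.0):
    """Try `primary` up to retries + 1 times, then `backup`, then degrade gracefully.

    Returns (answer, source) where source is "primary", "backup", or "degraded".
    """
    for attempt in range(retries + 1):
        try:
            answer = primary(prompt)
            if is_valid(answer):
                return answer, "primary"
        except Exception:
            pass  # treat a crash like a lost packet and try again
        # Exponential backoff after each failed attempt (a no-op when base_delay is 0).
        time.sleep(base_delay * (2 ** attempt))
    try:
        answer = backup(prompt)
        if is_valid(answer):
            return answer, "backup"
    except Exception:
        pass  # the backup failed too; fall through to the safe message
    return "Sorry, I can't answer that reliably right now.", "degraded"

# Hypothetical agents used only to exercise the wrapper.
def flaky_agent(prompt):
    raise RuntimeError("simulated outage")

def simple_agent(prompt):
    return f"echo: {prompt}"

answer, source = call_with_recovery(flaky_agent, simple_agent, "hello", is_valid=bool)
print(source, "->", answer)
```

The design choice worth noting is that the caller always gets an answer plus a label saying where it came from, so downstream code can treat a "degraded" response differently from a normal one.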
The Era of Real-World Human Interaction: RL from User Conversations
The paper introduces a new paradigm for aligning language models called Reinforcement Learning from Human Interaction (RLHI), which focuses on learning directly from in-the-wild user conversations rather than expert-curated feedback. The approach develops two complementary methods:
RLHI with User-Guided Rewrites, which revises unsatisfactory outputs based on users’ natural language follow-ups
RLHI with User-Based Rewards, which uses a reward model conditioned on a user’s long-term history, or persona
The paper implies that training models using organic human interaction data significantly improves personalization and advocates for a shift toward using the dynamic, contextual feedback inherent in real-world use to achieve continual model improvement and better alignment with diverse user preferences.
Source: https://arxiv.org/pdf/2509.25137v1
Views are personal and should not be attributed to my current or past employer(s). Tools used to assist in writing this blog post: NotebookLM, GPT-5, recommendations from Scouts by Yutori.

