Skip to content

refactor(cdk/a11y): FocusMonitor now uses InputModalityDetector under the hood. #22489

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 26, 2021

Conversation

zelliott
Copy link
Collaborator

@zelliott zelliott commented Apr 15, 2021

Demo: https://angular-ivy-ls9hgx.stackblitz.io/#

InputModality changes:

  • Added InputModalityDetector.modalityDetected that emits whenever an input modality is detected. This is in contrast to the existing InputModalityDetector.modalityChanged, which emits whenever the input modality changes. This method is specifically needed for FocusMonitor, but presumably it could be useful to any user. We could also make this an internal-only API potentially? When FocusMonitor is in IMMEDIATE mode, it attempts to associate the focus event with the input modality event in the current/previous tick (note that this isn't entirely true for touch interactions). This means we need to update InputModalityDetector to expose all modality detections, as opposed to just when the modality changes). Otherwise, the modality that FocusMonitor receives from the InputModalityDetector could be stale.

  • Previously, InputModalityDetector ignored fake mousedown and touchstart events, now it detects them as keyboard input modality. It did this because I mistakenly thought these events where emitted while linearly navigating / in browse mode with a screen reader. It turns out they're emitted when activating a control with a screen reader (e.g. VO + Space to activate a button), in which case we probably do want to detect keyboard input modality. We'd want to make this change regardless of whether this PR goes in.

FocusMonitor changes:

Regression when touch events are quickly followed by programmatic focus events: Previously, when a touch event was followed by a focus event, we'd use the _wasCausedByTouch method to determine if the focus event should be attributed to touch. This method attributes the event to touch if (1) the focus event occurred < 650ms after the touch event and (2) the touch event target is contained within the focus event target (or the targets are the same). It had some issues (see the huge comment in the method itself) but as a heuristic apparently worked well enough.

Now, we only attribute the event to touch if the former is true. The reason we can't perform the latter logic is because we no longer have the touch event itself (as it's not exposed by InputModalityDetector). What this means in practice is that in scenarios where the user touches an element, and focus is programmatically moved as a result of that touch, that focus will be more often mis-attributed as touch instead of program.

At the end of the day, the core problem is that the IMMEDIATE mode simply doesn't really work with touch events. Per the comment in FocusMonitor:

/**
 * Any mousedown, keydown, or touchstart event that happened in the previous
 * tick or the current tick will be used to assign a focus event's origin (to
 * either mouse, keyboard, or touch). This is the default option.
 */
IMMEDIATE,

This is inaccurate for how touch events are handled. We're forced to wait 650 ms to attribute the touch event. I'm not really sure what the best way to solve this is, some options below.

  • Do nothing, just live with the slight regression. It's easy for developers to work around, they just need to use FocusMonitor.focusVia(el, 'program') instead of .focus() directly.
  • InputModalityDetector exposes the underlying events themselves so that FocusMonitor can perform its current logic.
  • We change the behavior of IMMEDIATE to instead be within the last N ms, where N >= 650 and is the same regardless of keyboard/touch/mouse.

Note: The source itself suggests another solution, copied below:

// If we decide that we absolutely must handle this case correctly, we can do so by listening
// for the first focus event after the touchstart, and then the first blur event after that
// focus event. When that blur event fires we know that whatever follows is not a result of the
// touchstart.

However, this doesn't work for the case where a non-focusable element is clicked, and upon click, programmatically moves focus to a different element. No focus/blur events will be fired for the first interaction.

@google-cla google-cla bot added the cla: yes PR author has agreed to Google's Contributor License Agreement label Apr 15, 2021
@zelliott zelliott requested review from crisbeto and mmalerba April 15, 2021 16:33
@zelliott zelliott requested a review from andrewseguin as a code owner April 15, 2021 17:08
@@ -728,6 +728,9 @@ describe('MDC-based MatTooltip', () => {
tick(500);

expect(overlayContainerElement.querySelector('.mat-mdc-tooltip')).toBeNull();

// Flush due to the additional tick that is necessary for the FocusMonitor.
flush();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These flush calls are because this test only ticks 500 ms, but the FocusMonitor creates a timer for 650 ms (TOUCH_BUFFER_MS) for touch events. Ticking for 650 ms, or adding this flush, allows the pending timer to complete. Previously this timer was only created for touchstart, not focusVia(el, 'touch'), now it's created for both (to match the behavior of keyboard and mouse).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth noting that the last time we did something like this in the FocusMonitor, it was difficult to land in g3, because it broke tests using fakeAsync.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. I'll see what things look like in g3 and rework things if needed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the change in g3 and things fortunately look good.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: These flush calls ended up being removed to keep FocusMonitor's existing behavior and because ultimately not clearing the origin after 650 ms when using focusVia is unnecessary (see #22489 (comment)).

Copy link
Member

@crisbeto crisbeto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the touch issue, maybe we should just expose the last event target?

@@ -728,6 +728,9 @@ describe('MDC-based MatTooltip', () => {
tick(500);

expect(overlayContainerElement.querySelector('.mat-mdc-tooltip')).toBeNull();

// Flush due to the additional tick that is necessary for the FocusMonitor.
flush();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth noting that the last time we did something like this in the FocusMonitor, it was difficult to land in g3, because it broke tests using fakeAsync.

@zelliott
Copy link
Collaborator Author

Regarding the touch issue, maybe we should just expose the last event target?

I think that's a better idea than exposing all events, still not quite perfect, but might be our best option. Let's see what @jelbourn and @mmalerba think as well.

// window blurred.
// 2) The element was programmatically focused, and thus we set the origin as 'program'.
// 3) The element was focused via navigating with a screen reader, and thus we set the origin
// as 'program' because keyboard events are rarely fired.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't totally follow the part here about keyboard events being rarely fired. Wouldn't screen reader navigation almost always be keyboard navigation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When navigating with JAWS/NVDA/VO, no keyboard events are fired (for the most part). For example, if a user is linearly navigating with VO via Caps Lock + Arrow Keys, no keyboard events are fired despite focus moving along with VO's virtual cursor. As another example, if a user is focused on element A, then uses JAWS' browse mode to navigate to element B (note focus doesn't yet move in this case), and then presses Enter to enter forms mode and move focus to B, again, no keyboard events are fired despite focus moving.

In these scenarios, we can't detect the input modality as keyboard, and as a result, any subsequent focus change will be attributed to program. Note that this is FocusMonitor's existing behavior today, it just isn't documented anywhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it- I knew screen reader interaction didn't kick off keyboard events, I just misunderstood the comment (the point being that you can't distinguish between programatic and screen-reader). I might reword slightly:

// If the window has just regained focus, we can restore the most recent origin
// from before the window blurred. Otherwise, we've reached the point where
// we can't identify the source of the focus. This typically means one of two things
// happened:
// 1) The element was programmatically focused, or
// 2) The element was focused via screen reader navigation (which
// generally doesn't fire events).
// Because we can't distinguish between these two cases, we default to
// setting `program`, erring on the side of not showing keyboard focus indicators.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (but omitted the final piece "erring on the side of not showing keyboard focus indicators" because FocusMonitor doesn't know about keyboard focus indicators, and can be used with or without them).

this._originTimeoutId = setTimeout(() => this._origin = null, 1);
clearTimeout(this._originTimeoutId);
const ms = (origin === 'touch') ? TOUCH_BUFFER_MS : 1;
this._originTimeoutId = setTimeout(() => this._origin = null, ms);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, in cases when we could attribute touch immediately, we would have done so, right? With this change, it looks like touch is always delayed by 650ms?

Copy link
Collaborator Author

@zelliott zelliott Apr 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand you correctly, that's almost correct. Previously, in the case where we could attribute touch immediately (i.e. via focusVia(el, 'touch'), we'd immediately update the origin and reset it at the start of the next tick. Now, we still immediately update the origin, but we reset it 650ms later, instead of at the start of the next tick. Thus, this change does slightly change the behavior of focusVia in that regard. It's easy to fix this, if we want, I think it'd be easy to do. Just want to make sure I understand where you're coming from before changing anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just trying to make sure I understood (which I did not). Now I follow, I think this makes sense.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I added a isFromInteractionEvent parameter to this method that allows us to distinguish from when it is called from focusVia versus from the InputModalityService. Please LMK if you have a better way of distinguishing between the two.

@zelliott zelliott requested review from jelbourn and crisbeto April 20, 2021 23:25
@mmalerba
Copy link
Contributor

re: what to do about the touch regression

I'm leaning toward just having a slight regression (assuming its not too hard to get in to g3). I would guess in 99% of cases it makes no difference. I don't think any of our components apply special styling for touch events, and I doubt that's a common use case many people have.

If it turns out I'm wrong and everyone starts demanding we go back to the old behavior we can always add an internal API to expose the last event and use that in FocusMonitor

Copy link
Member

@jelbourn jelbourn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also okay with the touch behavior, I think we can revisit if we get reports of issues

Overall LGTM

@jelbourn jelbourn added target: feature This PR is targeted for a feature branch (outside of main and semver branches) Accessibility This issue is related to accessibility (a11y) G This is is related to a Google internal issue P2 The issue is important to a large percentage of users, with a workaround merge safe labels Apr 21, 2021
@jelbourn
Copy link
Member

@zelliott you can add the "merge ready" label when ready to merge

@zelliott zelliott added the action: merge The PR is ready for merge by the caretaker label Apr 24, 2021
@zelliott
Copy link
Collaborator Author

Okay, should be ready to merge now. I made a few nit additional changes, please see the latest commit. I tested both the InputModalityDetector and FocusMonitor thoroughly on various combos of browsers/screen readers/systems, here's a short list of the minor bugs / known issues that I found (none of which I think are blocking):

InputModalityDetector

  1. Non-screen reader issues:
    1. [Chrome] [Safari] Right clicking after an initial right click is detected as keyboard. It should be mouse. Referenced in bug(cdk/a11y): [Safari] [Chrome] [MacOS] FocusMonitor improperly identifies subsequent right mouse clicks as keyboard focus origin #22549. Non-blocking because low severity.
    2. [Chrome] [Safari] [MacOS] Pressing Caps lock off is not detected. It should be detected as keyboard. No keydown event is fired, see https://stackoverflow.com/questions/39016292/keydown-event-is-not-fired-for-capslock-in-mac. If we care about fixing this, we should also listen to keyup specifically for this key (may not be worth the additional global event listener?). Non-blocking because low severity.
    3. [Firefox] [MacOS] Pressing Caps lock (on or off) is occasionally not detected. It should be detected as keyboard. See bullet above. Non-blocking because low severity.
    4. [Safari] Pressing right Meta (command) is detected as keyboard. It should be ignored by default. It's not possible to ignore it, however, because it has the same event.keyCode as ContextMenu. When we switch to event.key, this will be automatically resolved. Non-blocking because low severity.
  2. Screen reader issues:
    1. [VO] [Chrome] Performing a left mouse click with VO + Shift + Space is detected as keyboard. I think it should probably be mouse. Referenced in bug(cdk/a11y): [Safari] [Chrome] [MacOS] FocusMonitor improperly identifies subsequent right mouse clicks as keyboard focus origin #22549. Probably not Chrome-specific. Non-blocking because low severity.
    2. [NVDA] [Chrome] Triggering NVDA "Activate" on a touchscreen with double tap is detected as keyboard. It should be touch. Our hands seem to be tied here though because we can't distinguish between NVDA "Activate" via double tap vs NVDA + Enter (both send fake mousedown events). Ultimately maybe NVDA should be firing a fake touchstart event here? Probably not Chrome-specific. Non-blocking because low severity.
    3. [JAWS] [Chrome] Activating left mouse button with Caps Lock + 8 is sometimes detected as mouse and sometimes as keyboard. It should be always keyboard. This is because sometimes JAWS sets event.buttons to 0 and sometimes to 1. There's again nothing we can really do here. Probably not Chrome-specific. Non-blocking because low severity.

Note: The three screen reader issues above all have to do with interacting with the web page by triggering activation behavior. I added a note to the InputModalityDetector that stated that when a user is interacting with a SR, the service's behavior is largely undefined (due to the issues above).

FocusMonitor (in IMMEDIATE mode)

Note: Some of the issues reported above in InputModalityDetector may effect FocusMonitor and may cause bugs, but I've omitted them because they're already tracked above.

  1. Non-screen reader issues:
    1. Moving focus via HTMLElement.focus on mousedown, keydown, or touchstart (e.g. pressing enter on a button that calls HTMLElement.focus on some other element) is attributed as the origin corresponding to the modality that originally triggered the focus. It should be program (unless I'm misunderstanding FocusMonitor's attribution). Non-blocking because existing FocusMonitor bug.
    2. Using the mouse to drag text into an <input />, causing the input to take focus on mouse up, is attributed as program. It should be mouse. We could fix this by listening for focus event between dragstart and dragend. Non-blocking because existing FocusMonitor bug.
    3. Long press on touchscreen that moves focus is sometimes attributed as program and sometimes as touch. It should always be touch. From my testing, long pressing on elements with carets (i.e. like <input type="text"> works fine) while long pressing on focusable elements without carets (i.e. <button>) does not work. The problem seems to be that the focus event is not fired until after touchend in these cases, so the "wait 650ms" heuristic doesn't work. Non-blocking because existing FocusMonitor bug.
    4. [Chrome] [Firefox] Find to move focus is attributed to the last origin. I'd potentially expect it to return whatever origin corresponds to the modality that was used to close the find panel (i.e. mouse if the close button was clicked, keyboard if Escape was pressed). This isn't really a bug as much as it is an edge case - what's happening here is that the window is losing focus when the find menu is opened. There's nothing really we can do here either. Non-blocking because existing behavior and also likely not a bug.
    5. [Safari] Find to move focus has mostly the same behavior as the browsers above, but has some confusing edge cases I couldn't quite get to the bottom of. For example, if you go to https://angular-ivy-8ehzqo.stackblitz.io/#, press Command + F to open the find menu, type "keydown", press Escape, press Command + F again, and then press Escape again, you'll note that the origin was attributed as program. It appears as though Safari can potentially move focus before the window is blurred. There may be a fix here, but it's a very nit issue. Non-blocking because existing behavior.
    6. Clicking on a label that moves focus to an associated form control is attributed as program. I think this is actually expected behavior, it's just an edge case to be aware of. Not-blocking because not really a bug.

@angular-automatic-lock-bot
Copy link

This issue has been automatically locked due to inactivity.
Please file a new issue if you are encountering a similar or related problem.

Read more about our automatic conversation locking policy.

This action has been performed automatically by a bot.

@angular-automatic-lock-bot angular-automatic-lock-bot bot locked and limited conversation to collaborators May 27, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Accessibility This issue is related to accessibility (a11y) action: merge The PR is ready for merge by the caretaker cla: yes PR author has agreed to Google's Contributor License Agreement G This is is related to a Google internal issue P2 The issue is important to a large percentage of users, with a workaround target: feature This PR is targeted for a feature branch (outside of main and semver branches)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants