What to expect from this article:
- Thoughts on multiplatform (or cross-platform) design.
- Methodology of the guerrilla usability tests conducted at DroidCon NYC 2019.
- Discussion and visualization of usability test results.
- Epic Conclusions.
DroidCon NYC 2019: two days dedicated to Android developers and their frontier. 70+ technical Android talks. Thought leadership from around the world. Hundreds of Android, iOS, and multiplatform developers of varying degrees of expertise buzz around the conference, talking, listening, upskilling.
Surveying the crowd, I can’t help but notice the occasional Android developer playing on her iPhone. And vice versa. In fact, I saw quite a few people at this niche conference toting a mixed bag of devices: iPhones, Pixels, MacBooks, etc. BTW, “toting” is a fun conference pun because of all the freebie tote bags ;).
The casual commingling of the two reigning platforms, coupled with this year’s focus on 1. multiplatform development and 2. “Kotlin/Everywhere,” raises our question:
How should our design team approach multi-platform design?
As experts in KMP development and pioneers on that frontier, my coworker-devs at Touchlab know that the question of design must be approached carefully. Thus far, we’ve been fans of the prudent design agnosticism inherent in Kotlin Multiplatform. This conference was a perfect kickoff point for me to put some tendrils out into the crowd and gather some data.
Design parity is something that Touchlab has been thinking about for a while, especially since reading Linzi Berry’s article on Design Parity.
If we only build for one platform, let’s say iOS, then we’re not delivering a consistent product, reducing time and debt or raising the quality of experiences for every person for Android or Web. (Linzi Berry)
Linzi’s team had a problem:
“What happened was we started to get some native iOS paradigms, like centered headers and chevrons at the end of list items, being introduced into Android where they weren’t native — which is confusing to our end users.”
Ultimately, their team moved towards parity in order to treat users equally, “promote consistent designs and behaviors” and move towards a consistent design system.
This got us head-nodding, but also asking…are non-native design components actually confusing to end users? How confusing? Will we see quantifiable effects?
My quest: Use existing iOS and Android apps in simple usability tests to gauge users’ reactions to native or non-native design elements*.
Native is defined here as “native to the users’ platform guidelines as indicated in Google’s Material Design or Apple’s Human Interface Guidelines.”
Stop. ✋Rewind. ⏪ If it ain’t broke, don’t fix it. Right?
Thoughts on multi-platform design.
Google’s Material Design and Apple’s HIG have allowed us to more easily design for “intermediate users” (AKA the majority) by bypassing beginner/intermediate learning curves and upping confidence with familiar, domesticated design elements that lead to standard UX.
With this in mind, I’m thinking:
Do modern users of 2019/2020+ still need native design elements to confidently complete in-app tasks?
Hear me out: not only do humans spend countless hours on their phones internalizing platform nuances whether they want to or not, but we also spend hours in-app, experiencing and internalizing nested nuances.
Anyone who uses a digital product for hours a day will very quickly internalize the nuances of its interface. It isn’t so much that they want to cram frequently used commands into their heads, as much as it is unavoidable. Their frequency of use both justifies and requires the memorization. -About Face.
From https://www.businessofapps.com/data/app-statistics/ I know that:
- On average, US users use apps for 2 hours 57 minutes per day, out of 3 hours 43 minutes mobile time
- US users spent 77% of app usage time using their three top apps, on average
- US users use an estimated 20.1 apps per month in 2019
That’s a lot of app-time. That’s potentially quite a lot of exposure to app-specific UX nuances. That’s exposure that could increase users’ tolerance for variance from platform guidelines. That’s variance that might just be internalized and learned and reinforced.
Arguably, this could mean that users might be more comfortable with idioms (learned, app-specific patterns) and rely much less on compounds (generic combinations of standard primitives).
Also arguably, the line between idioms and compounds has been blurred.
I’ll use myself as a counterexample, because I have my “social networking” screen time limited to 30 minutes on my phone.
My top 3 apps on my iPhone (besides messages…) are:
When I’m in Spotify and I tap the “meatball” icon (or the more_vert icon), I darn well better see a full-screen overlay giving me the option to share the song or go to the radio. (These are tasks I do often.)
But when I use Instagram, I am still sometimes perplexed by its architecture.
While I can easily browse my old photos or add a story, how do I allow someone to follow me, again? (This is a task I do rarely because I am lame.)
As for Pinterest, well, if I get over the embarrassment of it being a top-3 app, I can tell you that I expect and am pleased by the full-screen, ultra-dynamic, cool, neat, interactive search screen. I can also tell you that I just now guessed that this questionable icon:
would lead me to settings, purely because the placement of the settings screen(s) was architecturally similar to Instagram and Spotify. I can tell you, too, that I’d never visited my Pinterest settings and probably never will again. Bye.
All of this to say…
“Both beginners and experts tend over time to gravitate towards intermediacy.” — About Face
Even if an app is meant to be habitual or part of a daily routine and is therefore used fairly frequently, arguably we should be prudent and design for real humans who will inevitably slip back to intermediacy.
Many mobile apps have an inherently “transient” posture, meaning that the user’s interaction with the app is brief and task-oriented; when in use, the app needs to work within a range of expectations that the user might not even be aware of.
Transient users still internalize nuances and accidentally memorize interfaces, BUT because usage is isolated, potentially sporadic, and apps DO change…
Any help, comfort, confidence and implicit guidance that consistent Material Design and HIG can hand to the transient user is appreciated…right?
All you need in this life is ignorance and confidence, and then success is sure. (Mr. Mark Twain)
Another consideration, just to complicate things even more 🙂 – there are many mobile apps that are just one of many touch points throughout a brand experience. (Or plan to be that way soon.) More on that in a later article, but for now, let’s acknowledge that there are many arguments out there that advocate for consistency across interfaces.
With these (many) considerations bouncing around in my brain, I set out to test ideas myself at DroidCon NYC.
Why? To put iOS and Android versions of apps in front of users and see if they could tell the difference, or feel something was off, or demonstrate confusion or… anything really! #UXLife
Curiosity about apps that have multiple touchpoints (yet are likely still rather transient) drove my decision to use both the iOS and Android versions of the following 3 apps in usability tests:
United Airlines, National Geographic & Headspace.
Why United: Recently ~redesigned~ for “interactivity,” specifically when in an airport or in-flight. Also a Webby winner for App “Best Practices.” I would dare to say that it has VERY high potential for use across devices: web, phone, tablet, kiosk, etc.
Why Headspace: Unique design, strong brand, and pretty darn similar across devices, with some minor exceptions. Positive habits are encouraged here, which often means coming back to the app and becoming an expert user. Also, some potential for use across devices: mobile, tablet, web, etc. Not sure about actual numbers there.
Why National Geographic: People’s Choice Webby winner for UX. Theoretically high potential for use across devices: magazine, web, iPad, app, etc. **I’m not going to talk about Nat Geo in this article; after consideration, I think the results were skewed by tester error.
Let’s talk methodology.
TLDR 🤗; I made sure users didn’t know if they were looking at iOS or Android, put together some interactive prototypes and asked users to complete very simple tasks that exposed them to native, custom or hodge-podge UI components.
- I identified ~easy-to-complete~ tasks on Headspace, Nat Geo, and United that would expose the users to the respective platform’s native design elements and/or “custom” design elements. (See picture above.) I wasn’t testing the task flows of these established apps; I was watching reactions.
- I created 6 different interactive prototypes using Adobe XD: Android Headspace, iOS Headspace, Android Nat Geo, iOS Nat Geo, Android United, iOS United. I simply screen-grabbed from the actual apps using a Pixel 3 and an iPhone 8 and threw them into XD. I❤ XD
- I got rid of the iOS and Android status bars, as well as the Android bottom nav, in XD. I just…cropped them out of my artboards :). Bye, bye, dead giveaways! I also Photoshopped out “Siri” and “Google Assistant” on the Headspace settings screens just to be thorough, and resized the screens to be about the same size, albeit inaccurately.
- I used a horizontally rotated iPad and XD cloud to make sure the prototypes were not only disassociated from their iOS/Android mothers, but also functional, swipeable, tappable, etc. (I still had some issues at the actual conference, but I tried.)
- I broke the prototypes into two sets of tests: Test A, with 2 Android apps and one iOS app, and Test B, with 2 iOS apps and one Android app. I did not want the users to be able to compare screens directly or by memory; during an ad-hoc A/B session where users were allowed to compare, it quickly devolved into a “who’s a better dev” puzzle game. No thank you. Theoretically, I would show Test A (with two Android apps) to anyone who indicated iOS usage/exposure. That was just a quick way to decide on the fly.
- Conference time! Armed with a Touchlab T-shirt (Stranger danger!), the iPad, and my user tasks scripted out on my laptop, I gorilla-attacked 10 unsuspecting souls for my test. I targeted people with iPhones, MacBooks, or anyone sitting around looking bored. #doyouhave5minutes. I took notes on my laptop as we went through the tests together.
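As a side note, the on-the-fly assignment rule above (route each participant toward the set dominated by the opposite platform) can be sketched in a few lines. This is purely illustrative; the function name and labels are mine, not an artifact of the actual tests.

```python
# Illustrative sketch of the counterbalancing rule described above:
# send participants to the test set dominated by the *opposite* platform,
# so they can't lean on direct comparison with "their" platform.
# (Function name and labels are hypothetical, not from the study.)

# Test A = two Android prototypes + one iOS; Test B = the reverse.
def assign_test_set(primary_exposure: str) -> str:
    """Return 'A' or 'B' given a participant's main platform exposure."""
    if primary_exposure == "ios":
        return "A"  # iOS-leaning users see mostly Android screens
    return "B"      # Android-leaning users see mostly iOS screens

print(assign_test_set("ios"))      # A
print(assign_test_set("android"))  # B
```

The point of the rule is simply to keep each participant from A/B-ing the prototypes against muscle memory of their own daily driver.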
Let’s talk results 🤔😬👍😤🙂:
Wait, not like that ^ 👍😤🙂:
I tested 10 people (P1-P10) at the conference, listed here in order of expertise and exposure to mobile iOS and Android:
Most of the users had development experience (as expected at this …dev…conference), which is indicated as volume above the dotted line. Overlap (in green) connotes a commingling of platforms, whether in personal devices, in a professional capacity, or both.
It was important for me to note this overlap in order to consider how much cross-exposure the user could be accustomed to.
Headspace Usability Tests
There are few differences between Headspace for iOS and Headspace for Android: the back buttons differ within the settings, iOS has chevrons in its settings, and spacing feels a bit different throughout. You can see the screens below. When compared, I think the two apps have achieved design parity:
So how has opting for design parity (whether strategically or not) affected the experience of first-time users?
A whopping 7 of 10 users had specific complaints related to UX or UI:
5 of those 7 users were looking at the opposite device prototype from “their” platform.
Since these two prototypes were essentially the same, what the heck does that mean? The most common thing that folks complained about was the location of the settings menu: it WAS in the same place for both platforms, and looked the same.
6 users called out UI components. (What does “called out” mean? It’s a comment about the UI, positive or negative, that, theoretically, fueled their guess at what prototype they were looking at.)
But, noticing specific UI components did NOT mean that they were correct in their final identification.
These “bottom 4” users (above) noticed and felt a difference in the visual design, but it did not point them to an answer of which platform they were looking at.
Parity was successful(?) in that regard.
2 users (P2, P4: lower-level expertise, then) pointed out the guideline-specific back buttons within the settings; they actually ended up guessing the opposite platform even though they saw “their” platform. Unexpected, as it seems to me like those back buttons are “tells.” Perhaps they aren’t dead giveaways after all…
iOS screens were correctly identified 3 times, while Android screens were identified correctly only once.
(And the user (P8) who correctly identified the Android screens was a high-expertise Android user with multiplatform exposure. 🧠)
Why was iOS correctly identified when the designs are so similar?
Chevrons, after all, ARE used on both iOS and Android search screens. And remember, I Photoshopped out Siri and Google Fit from the settings.
4 of 5 iOS users identified their prototype as iOS, no matter what it actually was.
Is this an indication of bias? Or is this an indication of a level of comfort, or, well, tolerance to the design elements that might seem familiar to them?
Sidenote: When I asked some expert Android devs that I work with if the app seemed “iOS-y,” they didn’t seem to think so.
In summary, well, it’s a mixed bag:
Despite design parity, people who tested the opposite platform from “their” platform were more likely to hesitate or “mess up,” even if they didn’t think it was the opposite platform.
Importantly, though, it is not unlikely that these users were predisposed to notice these things and/or were primed to call them out because they were being tested. It seems that opting for design parity has indeed blurred the lines between the two platforms, since less than half of users could correctly identify the platform they were seeing. However, iOS users were more likely to be correct, which could be an indication of bias or even of inherent recognition.
It’s interesting that so many people did call out UI elements. Perhaps unique UI elements are by definition simply distracting. All in all, there were quite a few UI callouts, UX gaffes, and task failures.
I do not envy the information architecture questions that inevitably went into the build of United mobile.
There is a lot going on on the United App. A lot.
Wheelchair access, lost bags, checking in, flight status, booking a flight, rebooking a flight, gate changes, airport maps, in-flight considerations. The list goes on.
I chose to focus on booking a flight and navigating the architecture to find your “profile.” Why? Because booking a flight seems like a pretty darn critical task, and finding your profile seems like a task that even frequent users might not do very often on this app. (Assumption alert!)
Before I go on about bad UX, I do want to point out that I could not and did not test the shiny features of the app, which focus on the airport experience. From a glance, the UA app creators did a great job understanding the service as a whole and delivering a more pleasant “I already have a ticket” experience.
United Airlines Usability Tests
Conducting usability tests for the United app proved to be pretty interesting; this is a huge, omni-channel brand with experiences (booking, flying, and more) that can be stressful, confusing, and complicated.
Let’s go ahead and admit that what the users saw is not GREAT design.
Spoiler: Users agreed, commenting on the vastness of the catch-all hamburger as well as verbally indicating that it was “overwhelming,” “too much,” or “wow…”.
Looking at the screens, it seems United has opted for a platform-native approach for iOS, probably in order to minimize confusion around multi-step tasks.
(Also probably because they prioritized iOS. Just guessing.)
Android, on the other hand, seems like a hodge-podge of native elements and randomness. What’s going on with that square toggle? Why the iOS action sheet copycat?
But I also see small design decisions that could be in an effort to achieve parity across touchpoints, not necessarily across platforms. For instance, the home page and the radius of the dark-blue CTA.
What I wanted to know is this:
1. Did platform-native UI help iOS users navigate multi-step tasks? TLDR: Not really, but the standardization of the UI elements had what seems to be a positive effect on UX for both iOS and Android users.
2. Did iOS users feel less overwhelmed or comfortable with stressful tasks if they were using “their” platform? TLDR: Nope. Bad design.
3. Did Android users struggle more than iOS users? TLDR: I think so.
First things first, 8 people had a UX complaint or problem.
5 of those people were looking at the opposite platform. All 5 of those people also complained about the UI.
(The complaints were all over the place: perhaps a comment about the style of the hamburger, the existence of the hamburger, labels, the back button, icon safe space.)
Every single user who saw iOS complained about the UX; every single user who saw Android indicated that they felt overwhelmed.
Did iOS components help iOS users? Leaning towards no.
The 3 folks who saw iOS and have iOS experience (P2, P5, P9) shared 7 UX complaints among them. 2 of the 3 said they felt overwhelmed.
8 of 10 users total identified the prototypes as iOS. As you can see, it’s split 50/50 whether or not they were actually correct. What’s important is that they guessed iOS even if they called out the hamburger (oh, the hamburger) and made comments about the UI.
5 of 5 iOS users guessed that the app was iOS, even though they complained, felt overwhelmed, called out UI elements, or failed tasks.
My hypothesis here is that Android users simply did not “feel” like the app was modern Material Design, which is, well, true. Whether they were cognizant of this or not, I am unsure. iOS users? Maybe the same could be said. Maybe.
More Android users failed a task than iOS users.
It’s a tiny data set, but still important in light of the lack of modern Material Design elements.
Of the people who failed:
3 people failed a simple task during the arduous yet critical booking process. Still, I think 3 failures is 3 failures too many when it comes to booking. Oh, and all 3 of them said they felt overwhelmed. Oops.
2 people failed to find where to book a flight but it is worth noting that others also struggled to find where to book for more than a few seconds.
2 people couldn’t figure out how to sort or filter flights during the booking process and gave up.
2 people couldn’t find their profile and gave up.
While being unable to find a profile might seem ho-hum, I thought it was interesting to compare the architecture here of the profile with other popular apps.
You can see in this picture that the profile is located on the Home tab, in the upper right hand corner. (It’s also front and center with a HUGE CTA, technically.)
I checked a handful of popular iOS apps here to see if United was mimicking a popular IA:
New York Times: Different
Perhaps United UX researchers drew up personas showing that United frequent fliers 1. are business fliers who use Slack in the airport, and/or 2. watch a lot of YouTube while they wait for their flight, and 3. are iOS users.
6 of 10 indicated in various ways that they felt overwhelmed or confused at times. 5 of these 6 users were either Android users or saw the Android app.
4 of these users were looking at the opposite platform. 5 of these users complained mostly about the vastness of the menu. It IS a massive menu.
Yet, when asked to rank “ease” at the end of this usability test, several people gave it a 4 or 5 out of 5; 2 of those people were testing their own platform (and got their guess right!) while a whopping 5 were looking at the opposite platform. The spread, importantly, was from the least experienced users to some of the highest.
I think that here we might see an indication that the standardized UI and UX of HIG helped users out after all, even if they were Android users.
Despite a small data set and the “gorilla” nature of my tests, I do think we saw HIG-relevant UI elements have a slightly positive effect on the UX of all users. In the case of United, it combated bad design seen elsewhere in the app. In the case of Headspace, it simply stayed out of the way.
When it comes to standard tasks, I think we have to perform a balancing act: we need to understand the full range of frequency of use of each feature and its criticality to the business and function of the app. Here, designers and stakeholders have to be realistic about the actual habituality or loyalty of their users to their product and, dare I say, lean toward the negative there, designing for non-loyal users who frankly just forget how to do things.
I’d like to continue to perform usability tests that will shed light on the effects of 1. native design versus non-native (but good) design and 2. how the architecture of top apps affects the usability of transient apps. Any ideas out there? Let’s chat.
Originally from Jackson, Mississippi, Frances moved to New York after discovering that UX Design was a real job. When she’s not at Touchlab, she can be found on the A train, juggling a poodle, a Nintendo Switch, and some tangled headphones. Twitter: designofran
Gauging users’ reactions to non-native UI was originally published in UX Planet on Medium.