Saturday, November 24, 2012

Via Quora: How has eBay's review system evolved in the past few years?

The feedback system and "feedback score" is one of the first things associated with eBay (maybe second only to auction). The main purpose of any feedback system is to harness the "signals" community members send to "feedback processor" to encourage and "enforce" desirable behavior. Anyone who ever thought about or design a rating or feedback system knows that the task is not as easy as it may first seem. And the first step is to understand what the different attribute of a feedback system is.

A member of Quora community asked me to answer the question "How has eBay's review system evolved in the past few years?". The question provided me with an opportunity (and a little push) to talk about six attributes of any feedback system. See here for the full answer and if you are not using Quora, I strongly recommend you consider using it. It is addictive. 

Wednesday, March 14, 2012

What really is this "Managed Market Place" anyway?

I work for a division within eBay Inc. called "Managed Market Places". The name is a bit curious. I was asked, more than once and by range of people, what really "managed marketplace" is? Is it a new type of marketplace by eBay (no it is not!), is it a vertical/niche marketplace within eBay (no it is not!), some one on Quora even interpreted it as it means that eBay simply "manages" the marketplace as oppose to growing it! ( if this was the case why would eBay announce that to the whole word by labeling it as such?)

So then what exactly is MMP (as it is known internally) and why is it important?

the nature of the Internet lends itself perfectly to the basic concept of a "marketplace": a mechanism for buyer and seller to find each other. Marketplaces were and still are an important and growing part of the internet. The growing list of niche marketplaces include etsy, zarrly, odesk, airbnb, taskrabbit, yardsellr, zimride and many many more. (not to mention marketplaces from facebook, google, yahoo and other major players)

At the first glance, it looks simple enough: create a site that brings the parties to a transaction together (from buyer and seller of antique to two people who want to share a ride or a room), and either take a cut of the transaction or make money by advertising. This is indeed the basic concept behind a marketplace - or an unmanaged marketplace. Marketplace itself is not a party to any transaction. Buyer and seller deal with each other directly and take the risk (or bulk of the risk) of direct transaction. EBay operated, more or less, as an un-managed marketplace for a while too.

In managed marketplace on the other hand, neither party to a transaction takes a risk, in other word marketplace guarantees the success of transaction, no risk (at least ideally). Of course a managed marketplace can "manage" other aspect of interaction such as inventory, quantity, price, promotions etc. as well but for now we only focus on risk as it is the focus of eBay MMP as well.

The evolution of simple internet marketplaces to managed marketplaces is an important trend, as the Internet users become more sophisticated and demand more from services they use online. The AirBnB incident back in July of 2011 is a perfect illustration of how "unmanaged" marketplaces will be forced to offer a higher level of assurance/risk mitigation and become managed marketplaces.

What does it mean from systems and architecture point of view? Here are five main aspects that is particularly different in dealing with managed marketplaces

1- The first significant change is that of people's mind set: You have to see yourself in risk management business, or at least assume that risk management is a major part of your operations. What this changes first, and foremost, is that you now have to identify, assess, prioritize, mitigated (or plan to) and measure risk. In all likelihood all of these activities (and the tools and systems you need to perform them) are new to you if you are dealing with a simple/un-managed marketplace.
2- Central to any consumer risk management scheme is "Identity", and I don't mean OpenID or OAuth or SSO... I meant attribute, assurances, verification, accuracy, uniqueness or mapping a real world entity to a digital identity (Entity Resolution)
3- Data is the core to efficient risk management, and big data and your ability to collect and analysis them becomes central to your ability to operate the marketplace at a reasonable cost (minimum losses)
4- Coherent Architecture become even more important. Simply because your systems becomes more complex and more integrated. A simple marketplace is just that, a marketplace site/application. A managed marketplace would include identity provisioning and verification, risk definition, measurement, management at user and at transaction level, a system for filing claims and disputes, systems dealing with ever changing legal and business landscape that enforces what you can and can not do with data you collect and finally integrating all these system in a productive way (seamless but without coupling them)
5- Even Driven and Complex Event processing: This already has a big role in distributed system, but it plays more and more important role in distributed risk management. Real time assessment of risk becomes critical and due the cost/performance of risk assessment, incremental assessment or risk based on primitive and complex event generated over entire session (or even life time of a user) will be the only practical solution.

Wednesday, November 23, 2011

The Uncommon Security Common Sense

I can not claim that I actually counted or classified all the reasons peoples cite for not taking security (or for that matter sound and well thought through system design) seriously from the start, but the three following lines seems to be the most common ones:

1- The "it is too contained" line: So what is the big deal? at worst it may affect a very small percentage of my users.
2- The "it is too early" line: Oh my system/site/project is too small and we only have a few users, we really don't have time/resources for this.
3- The "it is too small" line: My project is too small or too obscure for anyone to care.

By the way, I have heard these lines or their equivalent not only when it comes to security engineering (or re-engineering) but also in designing business policies or risk management measure to prevent fraud, or in general negative user experiences as well as general system design.

Now to be fair, these reasons all sound like "common sense", after all why would you take on additional cost and time for your project or accept the expense and risk of re-engineering your code to fix an issue that may only affect 1% or 0.01% of your users? or why should you spend two weeks to fortify a system that takes you 3 days to design and it is "just an experiment"? and finally who really cares about a small project some where with some obscure URLs that takes an email address as one of its inputs and shows some useful error message if the email is not registered? does ANYONE really care?

Well, as it turns out, security common sense (like many other form of common sense) is actually quite uncommon ! Let's look at these frequently cited common sense logic a bit closer.

To demonstrate the fallacy behind the first logic (it is too contained, it only affect 0.01% of users) I cannot think of any better illustration than the words of presidential candidate Herman Cain where he said that "for each woman who has accused him of harassment there are probably thousands who haven't" and he is 100% accurate and right! But does that make any difference? In all likelihood his presidential bid is all but over. Or could the Washington D.C police chief during the "D.C Sniper Attacks" have possibly argued that the whole thing was not a big deal b/c only 0.001% of D.C metro population were actually killed and therefore there is no need for massive mobilization of police, FBI, ATF and even secret service !?

The same math is thru for security, it does not matter if only 1000 users out of 10MM become victim of a
poorly secured or design system. What matters is how many people hear and learn about it - and you can be
sure that at least in this day and age that number is a few order of magnitude larger than the actual number of
victims. The sense of insecurity that this causes in the rest of the user community and its economic cost is the real math that matters not the fact that only 1000/10MM=0.01% users were affected.

The second line "it is too early" or its equivalents "we don't have enough time or resources" is the most commons line not only in security matters but also system design and architecture aspects as well. What is interesting here is that the exact premise cited for not focusing on security (or sound design for that matter), is why security should be taken seriously i.e. "I am too new to afford not to be secure", if you are releasing a new product (or brand or a site) you REALLY DO NOT HAVE A SECOND CHANCE TO MAKE A FIRST IMPRESSION. If you are not secure, or if your first few user gets taken advantage off (think of AirBnB incident) you are doomed. To further demonstrate the risk in this argument I submit the following picture of one of the more famous car design mistakes : Honda Odyssey 1998

Honda designed this in a hurry to get into the growing minivan market dominated by Dodge/Chrysler. They decided to differentiate by replacing a convenient power sliding door with a traditional door! Imagine what would have happened of this was a new no-name company without Honda's established brand? Of course Honda corrected the mistake in 1999 model and beyond and went on to have one of the most successful Minivans. But if you are not Honda, you better spend time and money on designers and marketers to tell you,  in the first try, that whoever buys a minivan *needs* a sliding door.

Now we get to the third line "Who really cares about me?" I have to admit that I have the most sympathy with people who resort to this logic. After all it is tough to imagine how capable and resourceful the modern fraudester/hacker community is without actually having a brush with them. I do not get into the details -  if you are interested you can briefly scan Rick Howard's excellent book "Cyber Fraud, tactics, Techniques and Procedure" - for the purpose of this writing I'd suggest you assume the following is true:

In the game of "Who wants to break into my system" your adversary is more motivated (financially or politically) than you are, more experienced than you are, is more innovative than you are, is more nimble than you are, wants it worst than you do, has a smaller cost base than you do (and therefore) all he needs is 0.01% (or smaller) of your users - the ONLY advantage that you have is that you right the rule of the game. Do not give up that advantage easily. You WILL lose the game.

Btw, the end point URL that takes an email and very nicely checks and display an error if email does not belong to a valid user - was actually found (although it was a little obscure URL not linked to from anywhere -  and used to extract valid company X user emails (cost $5000+) from a large list of non-verified harvested emails (cost $50) - a vital part of phishing industry value chain.

Sunday, November 6, 2011

OIX Attribute Exchange Summit - Washington DC

Open Identity Exchange (OIX) is holding this year's Attribute Exchange Summit in Washington, DC.

Identity attributes are core of the concept of digital identity. As federated identity ecosystem getting more mature and adoption grows among more sophisticated RPs  - with more consequential use cases such as government, health, education, commerce ... - so does the need for wider sets of attributes with more accurate and fresh values. This presents both tough challenges and opportunities for IDPs.

The challenges center, as one may expect:, around aggregating, correlating, transform and maintaining fresh copy of attributes in a cost effective manner and in a way that it does not compromise the privacy (and other rights) of the principle owner. IDPs can differentiate based on the range of attributes they provide in this way and there in lies the opportunities.

I will be talking more about identity attributes, their life cycle, uses cases and how they help establish and elevate trust among parties to commercial transactions (online and off-line) as part of a panel with Don Thibeau, OIX/OIDF chairman and Abbie Barbir, VP BoA.

If you are planning to attend, I'd be happy to hear from you.

Monday, October 17, 2011

OAuth vs. OpenID Connect ?

OpenID Connet 1.0 Spec is finally released (actually it was release back in Aug). Its release was accompanied by two predictable categories of questions/sentiments, one not very well informed and the other one a legitimate question:

-        OpenID is dead
-        OpenID Connect is really OAuth so why do we need a new protocol?

Granted, this is normally coming from software engineers and social application programmer community and not from identity community, but I feel they are significant enough to be addressed, especially at the time that more and more entities contemplating to become identity providers and they need to decide which protocol they should implement.
First, on the demise of “OpenId”:  It is true that the earlier versions of Open ID (version 1 and version 2) are, for all intent and purposes, depreciated and will not gain a whole lot of traction. But the general idea of “Open” standards for communicating between RPs and IDPs that enables users to provision fewer accounts and have a portable identity while still maintaining control over their privacy and data is alive and well and actually is even more vital than before.
Second, on relationship between OAuth and OpenID Connect, OAuth is a general protocol for authorizing an agent to access a resource on behalf of resource’s owner. OAuth does not assume any particular knowledge about the resource itself. What does this mean? Let’s go back to the canonical OAuth use case of a user who would like to authorize a printing services to access her photos from a Photo service provider. Now imagine that the photo service is slightly sophisticated and recognizes a few properties associated with photos e.g. resolution, size, whether they are shots with no humans, and if there shots with humans, who appears in the photos – basically let’s assume the resource served by SP has more semantics that simple “access”.
Now imagine that the user wants to grant access to only JPEG photos of himself and not a full access to all photos. How would the IDP encode this semantics in the authorization request and response? How would the SP know that they should only provide access to a subset of images?
To be sure, this is doable using OAuth, but the implementer has to add additional parameters to request and response or possibly constraint the input values of some other parameters.
A protocol that is built this way to access a specialized resource, would be a photo access protocol built on top of OAuth.
In essence this is exactly what OpenID Connect it: It is a protocol built on top of OAuth that supports features that are often desired and used when the resources being delegated is “identity” and attribute about an identity.
To illustrate the point, here are what we, at eBay, had to do for an internal authentication protocol on top of OAuth:
      Force Authentication: adding parameters to authorization request to force users to authenticate no matter what the authentication state with IDP is
-        Authorization Behavior: adding parameters to authorization request to indicate to IDPs whether it should display the consent page and how to display the login page (overlay, full page)  
-        Standard Claim Set: Defining the default set of attributes returned by IDP
-        Requested Attributes: adding mechanism to allow RPs to ask for additional attributes, and annotating them to indicate whether explicit user consent is required.
-        Authentication Context: adding a fragment to response to communicated authentication context (single v.s multi-factor, PIN vs. Password, number of retires etc.)
-        Protection: adding parameters to indicate how access tokens should be protected (encryption, signature and order of operations)
-        Token Validation end point: adding an endpoint to introspect access tokens on demand.

These are all features and facets that OpenID Connect enables in a standard and interoperable fashion. In absence of a standard such as OpenID Connect though, any RPs integrating with our IDP had to implement basically a proprietary protocol, be it on top of OAuth.
The point is that if you want to operate an IDP and you want to use just OAuth, you have to add a few things to OAuth, depend on the depth of your requirements, to make it work for “Identity” resource. This is exactly what Facebook did with FB Connect – and they also did a good job of wrapping it with JavaScript plug-ins. The goal of OpenID Connect is to use OAuth as the basic access authorization protocol and add identity specific features to it so that it becomes a standard “identity protocol” that can enable seamless interoperability. 

Wednesday, October 12, 2011

PayPal Access & Commercial Identity

Today eBay Inc. announced an identity and attribute provider product called PayPal Access. Some described as a "Facebook Connect for Commerce", others described it as an easy registration tool for mobile site. Today at the X,Commerce Innovate Conference someone suggested to me that this is the first step for eBay Inc. to offer full cloud based user management for e-commerce sites and merchants. You can also see the official press release from eBay Inc. here.

Most of the press and coverage today focused on "Consumer Identity" - or more accurately Consumer Commercial Identity - and the benefit of PayPal Access for consumers and online merchants visited by those consumer. Consumer identity is indeed one facet of "Commercial identity" - but there is another side to commercial identity, a less understood - and arguably less sexy - side and that is Merchant Identity. What do I mean by this? Let's look at a scenario:

Merchants themselves are consumers of so many online and offline services (think of it as B2B services) - a company that sells on eBay - or any other online channel - has an eBay account, an account with a shipping company (FedEx), a Facebook account, perhaps another account with a email marketing service, bank account etc. Clearly merchants suffer from the same "account and password hell" that consumers do - but this hell is a lot deeper and hotter for merchants, consider these facts

- Most merchants have employees/contractors who create these accounts on behalf of the merchant,
- A lot of these employees (for smaller merchants) are part time or temps
- Employee turn over is high

Here in addition to the usual forgetting one's password - which for merchant leads to loss of productivity and money - sometimes the person who created the account simply leaves - if you are lucky and s/he good terms, you end up having to chase the employee and restore your access, if not, you are exposed to unauthorized access by the employee or down right "account take over".

You might say, what is the difference between this and consumer identity, these employee are consumers to and technically there is no difference. But look closer. Merchant use cases are fundamentally different. In consumer identity use cases, a consumer is a principle and gives consent on his/her own behalf to a agent (another site or application), the IDP itself recognizes the consumer is the principle and allows her (and ONLY her) to change or revoke this access. In Merchant cases, what appear to be the consumer is really not a principle binded to the merchant identity but an employee. In this case IDP must recognize this "hierarchical relationship" and allow and "admin" employee of merchant to monitor and manage the life cycle of tokens (and identities) of employees.

In the use case above, merchant X would not reveal its primary eBay user name and password to any employee, the would provision an account for each employee. Employee then logs into eBay using her own account - and via PayPalAccess - All the while PayPal Access monitors and manage all the tokens issued to all employees of merchant X. Should an employee leave or changes function, the token can be revoked by merchant X admin regardless of employee's decision.

If this sounds familiar to LDAP or ActiveDirectory, b/c it really serves the same function: Enterprise Identity, in this case enterprise is really a merchant. This is not unexpected in the world where enterprise identity, consumer identity (a.k.a social identity) are converging - and there is a need for a cloud based enterprise user management.

Please note that this is NOT an annoucement (or leak) for PayPal Access Cloud-Base user directory. IT IS NOT, REALLY. I just wanted to point out the there is two sides to commercial identity, a sexy side (consumer) and a side that can make you money (merchants).

In the next post, I will write a bit about Consumer Commercial Identity and how it may be different that social identity.

Saturday, October 8, 2011

Magician vs. Engineer

Steve Jobs, the man, died a few days ago. Steve Jobs the symbol and the icon in all likelihood lives on for a long time. In this status he is joined perhaps only by one other man: Bill Gates (whether people agree or disagree with his business tactics, feel that MSFT produced low quality or hard to use software... no one can deny the fact that he was one of the first few who realized that software would be the key to pervasive computing, co-founded the first real software company and put software engineering as a profession on the map).

That is why I was so excited to watch them being interviewed on the same stage and at the same time @ D5 in 2007. When the news of Steve Jobs passing came out, I went back and watched it again and this time I found it simply fascinating - the session is about 1.5 hours, an hour the interview and 30 minutes Q&A.

It is long, but it is well worth the time.

After about 4.5 years, and with hindsight, it is so amazing to see Steve Jobs explaining his vision about "Post-PC" era and superiority of native applications, merits of integration of hardware and software, and in general, what we can only now recognize as a general description of  iPad. (with the notable absence of any reference to App Store)

Interestingly, Bill Gates, in response to what he sees as post PC era, talks, spot on, about tablet computing, significance of touch and the "convergence device".  Keep in mind that MSFT had been doing basic research in this field for a long time.

In my view, both men shared the same over all knowledge of trends and technologies in 2007, both knew that a device that is basically touch enabled and connected will dominate the future. Then why Apple was able to come up with iPad and MSFT ended up with a few tablet from bunch of manufacturers that only a few ever saw live in action, let alone use.

Amazingly, Bill Gates answers this question himself - in what I think is the most interesting exchange of the interview: a member of audience (a woman @ about 1:22 into the video) asks both Jobs and Gates "...What did you learn about running your own business that you wished you had thought of sooner or first by watching the other guy". Bill Gates volunteers an answer first and says:

"...he (Steve) has an intuitive taste for product and people, we sat in Mac product reviews and question would come up that I would view it as engineering question b/c that is how my mind works and I'd see Steve make a decision based on a sense of people and product that is even hard for me to explain, the way he does things are just different and I think it is magical and in that case WOW" (he never mentions what that case was)

There, ladies and gentlemen, you have it! This is why MSFT and Bill Gates, as great as he is, with all the knowledge of trends, technologies could never quiet come up with that "convergence device". They (or any one else) did not have that magical "intuitive taste for product and technology". As Bill Gates said it, it is even hard for him to explain let alone replicate that intuition.

A lot of things have been said and written about Steve Jobs greatness, but none captures and express the essence of what Jobs did better than the description given by the only other icon of modern computing Bill Gates: intuition vs.solution , soul vs. mechanics, empathy vs. sympathy ..Magic vs. Engineering.