Big Question: Would a standard MLS data format matter?

Apr 3, 2009  |  Michael Wurzer

I’ve long advocated that MLSs work together on as wide a basis as possible (nationally or regionally) to agree on a common data format for MLS listings.  My theory is that if there were a common data format, the data could move much more easily among MLS systems, enabling MLS vendors such as ourselves to more easily and effectively create regional MLSs along market boundaries.  My question is this:  Assuming all MLSs had common listing formats, would exchanging data among MLS systems be much easier?

Here are some follow-up questions:

  • If data exchange were easier, would it eliminate the need for any broker to belong to more than one MLS?
  • Would it eliminate the need for duplicate listing entry?
  • Would it make IDX data feeds consistent?

And here’s the doozy of all questions on this topic:

  • If the answers to the above questions are yes, how do we get all the existing disparate formats converted to a new common format?  By this question, I don’t mean how do we agree on a format.  Rather, assume agreement is reached.  Instead, I’m wondering about the actual conversion work.  Anyone that’s been involved in a data conversion knows that it isn’t trivial.  The new formats require changes to forms, the new formats have to be learned by members, saved searches often have to be re-done, someone has to write a conversion program to transform the data, and lots of other details.  Are the costs of this effort worth the long-term benefits?

27 Responses to “Big Question: Would a standard MLS data format matter?”

  1. Craig says:

    I still have visions of regional MLS systems and would like to one day see that happen. Coming from a technical background, I know and agree that the ETL work can be (and is) a nightmare. Just having discussions of changes to data can sometimes send me into the shakes, because I know what the technical person is going to have to go through to accomplish something that is so easy to speak of. I also know from the business side that getting a larger membership to agree to business rules would be more challenging. Coming from a large system, there are times that I think a decision is a slam dunk, only to find too many “edge” cases for it to be one, and I can only imagine the challenges as membership goes up.

    Yes, I believe that if we can get to the point of agreeing on a common layout and business rules, exchange of data will move quicker, third party services will improve, consumers will greatly benefit, and agents’ work will be reduced. But then again, if we can reach that point, what’s preventing one single MLS for the whole country? Maybe one day that will be the reality.

  2. The challenge to broader data sharing is more political than technical in most circumstances. However, even a perfect “omnibus” listing format does not mean that every MLS system can interpret or display the data once it’s received: If my system has periodic rentals and your system does not, how could we ever share (let alone integrate) the data?

    I still believe that the primary goal of a universal format is to foster listing syndication… which at least was the fundamental intent of a listing (an advertisement) in the first place.

    Homogeny in the management of listing data is a long, long way off.

  3. Kristen Carr says:

    I agree w/most of what Matt says (as usual) but I disagree syndication is the main goal. I worked for a very large Regional for several years then worked for a smaller Regional who exchanged listing data with 2 other bordering MLSs. The business issues around the data exchange were very difficult. If any of the 3 MLSs wanted to make field changes (adding, for instance, GREEN fields) the other 2 MLSs either:

    1) Didn’t get the data
    2) Had to add fields too – whether the same or similar

    I think creating some kind of standardization is worth the efforts. Training, updating, mapping, etc. are a huge time and money consumer but the payoff in the end would be huge! Mike – this is the very reason I keep hammering away (at my own head???) with RETS/RESO. I *believe* we can get to that point and I *believe* the agents, brokers, MLSs, vendors, 3rd parties will benefit. Standardization is the key to easily exchanging information…bottom line.

    Thanks for yet another great post!

  4. As a strong believer in “all real estate is local” and being aware that “local data fields” are relevant to marketing local real estate (which is the oil that allows us to have these discussions), I think the best we can hope for is commonality in non-unique data fields. While data fields relating to avalanche zones are important to the marketing of real estate in the mountains in CA, they are of no value to buyers looking in FL.

    Don’t mess with my local data fields 🙂

  5. Mark Flavin says:

    Michael, I would like to echo Dave in that there will still need to be local data fields that will be different. For instance, folks in Alameda and San Francisco need to deal with floating homes; however, people in Tracy are more worried about cattle fences. That being said, a strong case should be made for standardizing the non-unique fields such as bedrooms, bathrooms, price, etc.

    My thought is that even though such decisions would hurt in the short term they should be defined in the RETS standard. The biggest time suck in terms of data mapping is aligning these common fields with uncommon values. However before any such standard is defined a conversation must take place as to what exactly the common fields are. A great place to start in that is IDX.

  6. I come from the same background as Craig except that it’s still my primary business. I can echo everything he says, especially since I can never seem to get away from the issue. Every single project I work on seems to involve dealing with an outside system that is controlled by someone else and the data is in a different format.

    If you don’t have a common format for fields that mean the same thing, it just makes the process so much more difficult. I just went through this with data from a single hospital where Yes/No came in as Y/N, 1/0, 1/(null), Yes/No and so on. It doesn’t sound like a big deal but someone has to deal with it in the “T” part of ETL (Extract, Transform, Load).

    There will ALWAYS be unique data fields and they must be supported. However, if they are addressed in the standard (or “best practice”) then you will likely find out that the fields that are unique to your MLS in the region may also be appropriate in the next state or across the country. Why not define a single data field instead of 10 that mean the same thing?
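
As a rough illustration of the “T” step described in this comment, here is a minimal Python sketch that folds the Yes/No spellings listed above (Y/N, 1/0, 1/(null), Yes/No) into a single boolean. The function name `normalize_yes_no` and the `null_means_no` flag are assumptions made for this example, not part of any MLS system or standard.

```python
from typing import Any, Optional

# Spellings taken from the hospital example above; compared case-insensitively.
_TRUE_TOKENS = {"y", "yes", "1"}
_FALSE_TOKENS = {"n", "no", "0"}


def normalize_yes_no(raw: Any, null_means_no: bool = False) -> Optional[bool]:
    """Map one source value onto True, False, or None (unknown).

    null_means_no is a hypothetical per-source switch for feeds that send
    1/(null), where a missing value stands for "No".
    """
    if raw is None or str(raw).strip() == "":
        return False if null_means_no else None
    token = str(raw).strip().lower()
    if token in _TRUE_TOKENS:
        return True
    if token in _FALSE_TOKENS:
        return False
    # Fail loudly instead of guessing, so new spellings get reviewed.
    raise ValueError(f"unrecognized yes/no value: {raw!r}")
```

The point of the sketch is only that each source needs its own small rule (here, how to read a null); a shared field definition would remove the need for the per-source switch.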

  7. brian wilson says:

    I do not see the benefit. I prefer decentralization, barriers to aggregation, and local control. The only case when I think otherwise is when multiple mls’s serve the same market. Otherwise agents only work 1 market so why would it matter to them? Building codes are local, schools are controlled locally… some things do not get better as they get bigger.

  8. CWallace says:

    After working in the Bell System and being involved with a conversion to USOC – Universal Service Order Codes that all of the Bell System used for data entry I was shocked and frustrated upon entering the real estate industry that there was no standardization. It would certainly make data integrity more achievable. I just had to ‘massage’ the entry of the # of bathrooms so that it would appear ‘correct’ in a 3rd party software.

    However, I don’t see this being achieved in the near future as there are too many ’empires’ clinging to maintain their independence from regionalization or for other political reasons. What a shame….standardization would make data more accurate and using 3rd party software less frustrating.

  9. Ron Stephan says:

    While I can appreciate the complexity of matching a city MLS database to a rural database, I think for the most part data fields are in fact common. I guess it depends on what direction you are going. It seems that it was relatively simple for 60 MLS’s in Florida, comprised of city and rural MLS’s, to agree to send data to the statewide MLS Advantage product without getting into the bits and bytes of proprietary details. If there was something so unique to a local market, it quite simply didn’t make it to the overlay or was put somewhere else in the overlay.
    Likewise if we were to accept data back from MLSA it would be general data fields that were defined and if we didn’t have a particular data field (I can’t think of one we wouldn’t) then we would have to put the info somewhere else or simply not use it. Would it create the perfect search result? Maybe not, but as long as the listing was identified as coming from MLSA I think our members are intelligent enough to adapt.
    What members are looking for from “other areas” are the private fields of information such as remarks that help them make a decision about whether or not a property meets their requirements.

  10. Thanks, everyone, for your comments. Unfortunately, nearly everyone has focused on the issue of whether common data formats can or should be created. My question, though, assumed such common data format had existed or was agreed upon and instead asked: Would exchanging data be easier if there was a common format?

    Let’s take Ron Stephan’s Florida example, where many MLSs have submitted their data to FAR for inclusion in eNeighborhoods’ MLS Advantage. Presumably MLS Advantage harmonizes all the disparate data fields into a common format. So, my question is why haven’t all or any of the local MLSs converted their system to that common format? If all the MLSs were on the same format both locally as well as regionally, wouldn’t that make it easier for a wide variety of data exchanges?

    My point really is this: There are and have been agreements in many parts of the country on common data formats, but yet the local MLS rarely converts to that common format. Why is that? Is it because of the need for local fields or is it the expense and difficulty of the initial one-time conversion? Or is it that the perceived benefits do not outweigh the expenses? These are the questions I’m trying to understand better.

  11. Realistically, there is no local gain for changing base formats, only an expense incurred from the conversion. If tomorrow’s MLS will look exactly like today’s MLS, but require a $50K data conversion exercise, why bother?

    Data format simply isn’t a local concern, and I doubt it will ever be. And, reiterating my own point and Ron’s, even if Advantage devises a “perfect” data format, it does not mean that the local MLS (software) knows how to interpret it semantically. At that point, your only option is to agree on a common system, not just format… and that is unlikely to happen in any broad (meaningful) manner.

    -Matt

  12. >>why bother?

    Wouldn’t it make exchanging data with other MLSs in the region easier, thereby lowering costs for IDX maintenance, joining multiple MLSs, dual or triple (or more) listing maintenance?

    >>does not mean that the local MLS (software) knows how to interpret it semantically

    If they underwent the one-time data conversion exercise, they would.

    >>only option is to agree on a common system, not just format

    Why is that the only option?

  13. Ron Stephan says:

    Are the costs of this effort worth the long-term benefits? When MLS Advantage started there was a cost to the local MLS or Association to participate. The local MLS’s surrounding us did not participate at that time, and therefore the cost to us for what we perceived to be only 10% of our members was NOT acceptable. Once the costs were removed it was a no brainer for us to sign up. The state association ate the costs and provided MLSA as a member benefit. Again, once the costs were removed it became a matter of convincing the surrounding smaller MLS’s that participating in MLSA and guaranteeing the compensation was good for their members as well. Once their individual members found out about the product, and that it was free (or at least already built in to their state dues), it was almost a requirement that their respective boards sign up. It has resolved numerous overlapping market disorder issues and has helped to develop a much less antagonistic relationship among the various boards. Have any brokers quit participating in more than one of the boards? The answer is minimal. The associations and MLS’s involved have already worked out, or are in the process of working out, the details on any conflicting membership requirements so that brokers may have a branch office as a full participating MLS office under either a secondary or primary board’s membership. So in fact some of the brokers have been able to reduce multiple board and MLS costs as a result of the MLS Advantage overlay.

  14. > Wouldn’t it make exchanging data with other MLSs in the region easier, thereby lowering costs for IDX maintenance, joining multiple MLSs, dual or triple (or more) listing maintenance?

    It still depends on whether you are talking about data integration (being able to create the identical listing in every system) or purely data aggregation (being able to view any system’s listing in the local system). Data integration involves rule and logic assignments that will vary by system and some may not be at all possible in some systems. As my shining case in point because it is so specialized: periodic rentals (i.e., weekend/weekly). *Most* systems do not have this facility at all, so how can another system be expected to modify the data?

    >> does not mean that the local MLS (software) knows how to interpret it semantically
    > If they underwent the one-time data conversion exercise, they would.

    There is also the required logic conversion that may or may not be possible and would likely raise the cost of conversion significantly.

    >> only option is to agree on a common system, not just format
    > Why is that the only option?

    To ensure that input/output/display/exchange rules are homogenous and compatible.

    -Matt

  15. Ron, I would love to see data on whether MLS Advantage has reduced duplicate entry. More specifically, is the search traffic on MLS Advantage significant enough that the broker or agent is comfortable NOT entering the listing into MLSs other than the home MLS, or do they go ahead and continue entering it multiple times to ensure it hits the hot sheet in all the local markets?

  16. Brian Wilson says:

    I still don’t understand the “why.” Besides those few markets where there are overlapping or competing MLS’s and common standards would make sense for those that pay for the MLS tool (the agents), why else is this important to the local MLS?

    I can see that one advantage would be that IDX / VOW vendors would have an easier time offering innovative search systems to more MLS areas and the agents could benefit from this newer technology but I don’t see this benefit outweighing the risk of other IDX / VOW vendors aggregating and disintermediating agents from their own local customers.

  17. >> those few markets

    There are many markets where MLSs overlap. You’re right, there also are many where they do not, and this topic doesn’t apply to them.

  18. Ron Stephan says:

    While I don’t have specific data, I do know that our reciprocal membership count (only the broker and the agent who wants to join, instead of the full office) is down from those MLS’s that have chosen to participate in MLSA, but our full membership has increased as a result of some surrounding MLS’s not participating in MLSA. Of course we have also seen an increase in full membership because of the higher level of service that we provide, e.g., syndication, staff education support, etc.

  19. Well, we have an MLS on our border to the south using a different system but there are SEVEN others near us that all use the same system that we do. The odd one is in a very attractive area but is really more geographically oriented to their neighbors to the south which is an entirely different, but large MLS and system.

    So, regarding the expense of conversion, what’s a ball park estimate for the cost to each of our FlexMLS systems? This would of course require conversion of the data to the new common formats as well as all the forms and reports. Not to mention how to handle each of the unique fields.

    So, even if Michael says zero cost, we still have to get 8 groups together and agree on the common names etc. In this case it would really help to have FBS lead the way as to which is the better name and data type.

    That leads to another reason for resistance: Politics

    There may not be a good reason why one way is best, but there will be disagreements. Again, it may help to have a common third party lead the way.

  20. Craig says:

    Michael, I suppose that I started this thread off on the wrong foot because my mind works on several problems from several positions, not on whether anything is worthwhile, but how.

    “my question is why haven’t all or any of the local MLSs converted their system to that common format? If all the MLSs were on the same format both locally as well as regionally, wouldn’t that make it easier for a wide variety of data exchanges? ”

    Perceived value / effort (pain and cost). Some IDXs IMHO are worthless and are not even worth the time to evaluate; others appear to be valuable until you use them and find them less than ideal. I believe in change, I know that change is necessary, but I won’t change for the sake of change; it has to have purpose and direction.

    Just recently, we learned that we don’t have a definition of a bedroom, and the debates sparked by trying to create one were incredible. Imagine how hard it will be to do this with all the fields, or even the most important ones like Square Feet. I believe it is a worthwhile exercise for the simple fact that it will make a better MLS, a better solution, when everyone knows what a “bedroom” is. Once we can get agreement on that, then we can start to work toward the next step, integration. We define bedroom as a …., how do you define it? Can we integrate your stuff with our stuff? Is there value in doing that? Absolutely.

    So, running around your question for a bit, the short answer is yes: a common format is important, and is the future. It’s not if, it is when. We have to do this; we can’t go on forever in local systems, and the world is getting smaller every day. Yes, there will be local parameters and local experts, but we need to get data out there to the world for the local to be valuable.

    My point is that as an adult we need to speak and think at an advanced level, but first we must learn to just say the words and understand what they mean, then speaking in complicated sentences and sharing complex thoughts come naturally.

    Regional (and eventually larger) systems will come. The challenge today is getting ready to do that work: learning our fields, what they mean, and how they can be used correctly.

  21. Ron Stephan says:

    To Craig’s comment: a bedroom has a door, a window and a closet, if that helps :) However, we have to give our members some credit for common sense. If it’s NOT a bedroom, e.g. a curtain over the entrance to the hallway, I’m sure they will let us know really fast that it’s not a real bedroom. A loft typically has few if any walls and doors, so I don’t think a member would expect a strict bedroom definition to apply to a loft. We can try at a local level to categorize these technicalities, but at a higher level if someone is looking for a loft they would not expect the first definition I gave to apply. My point is we can’t possibly satisfy everyone’s definition. If we are exchanging data elements we have to be prepared to slot the unusual items (loft) somewhere else and allow for the exception in the rules.

  22. Just like with unique fields, there are always exceptions.

    Consider the old farm house with no closets in the bedrooms.

  23. Craig says:

    Okay, so I did it again, started a tangent, back to the original question, is there value in a single layout, yes, huge value.

    Why don’t more MLSs want to convert to a single layout? It’s rather difficult, to the point it over-shadows the huge value from question 1.

  24. Distraction is common with propeller heads…Ooo, donuts!

    Sorry, where was I? Oh yes.

    @Michael W said:

    “Would exchanging data be easier if there was a common format? ”

    Absolutely.

    “If data exchange were easier, would it eliminate the need for any broker to belong to more than one MLS?”

    That might be a business question since it may depend on MLS rules.

    “Would it eliminate the need for duplicate listing entry?”

    If the fields match and the rules are met, I would say yes.

    “Would it make IDX data feeds consistent?”

    I have no doubt.

    “…how do we get all the existing disparate formats converted to a new common format? … I’m wondering about the actual conversion work. Anyone that’s been involved in a data conversion knows that it isn’t trivial.”

    Huge project and not trivial. Since you are getting into the technical details:

    Off the top of my head (and a simplistic response), I would probably start with an entirely new database “instance” based entirely on the new standards (I’m a SQL Server guy, not DB2 so the terminology may be different).

    Then I would set up a way to map everything from the old system to the new one (ETL for those playing along at home). I have done it with SQL code and SSIS (old vs latest) but in any case you would want to be able to reload the new database as needed until you get everything just right.

    “The new formats require changes to forms, the new formats have to be learned by members, saved searches often have to be re-done, someone has to write a conversion program to transform the data, and lots of other details.”

    Exactly. Loads of testing too but as I recall you went through a major upgrade pretty well not so long ago.

    “Are the costs of this effort worth the long-term benefits?”

    Only you can answer that once you have a better idea of the requirements. Just gathering those will be time ($) consuming as well.
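
The mapping step described earlier in this comment (map everything from the old system to the new one, and be able to reload the new database until it is right) might look roughly like the following Python sketch. The old field names (`BR`, `BATHS_TOT`, `LP`), the standard names, and the `local_fields` bucket are all hypothetical; no real MLS schema or RETS definition is implied, and a real conversion would run against a database rather than in-memory dicts.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Hypothetical mapping: old field name -> (standard field name, converter).
FIELD_MAP: Dict[str, Tuple[str, Callable[[Any], Any]]] = {
    "BR": ("bedrooms", int),
    "BATHS_TOT": ("bathrooms", float),
    "LP": ("list_price", lambda v: int(str(v).replace("$", "").replace(",", ""))),
}


def transform(old_row: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one old-format listing into the assumed common format."""
    new_row: Dict[str, Any] = {"local_fields": {}}
    for name, value in old_row.items():
        if name in FIELD_MAP:
            new_name, convert = FIELD_MAP[name]
            new_row[new_name] = convert(value)
        else:
            # Unmapped (local) fields are kept rather than dropped.
            new_row["local_fields"][name] = value
    return new_row


def reload(old_rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rebuild the whole target from the source, so the load can be
    repeated after every change to FIELD_MAP."""
    return [transform(row) for row in old_rows]
```

Because `reload` never edits the source rows and always rebuilds the target from scratch, the mapping table can be corrected and the load rerun as many times as testing requires.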

  25. Robbie says:

    I think Matt nailed it when he said “Realistically, there is no local gain for changing base formats, only an expense incurred from the conversion. If tomorrow’s MLS will look exactly like today’s MLS, but require a $50K data conversion exercise, why bother?”

    The problem is too many people in real estate think locally, and data formats are a global problem. Many (most?) people only need to belong to one MLS, and thus they only see cost, instead of opportunity.

    In Western Washington, we essentially have one large regional MLS that controls the listings from the Columbia River to the Canadian border. They distribute data via a proprietary SOAP based web service and nobody except IDX vendors and large multi-state broker e-Marketing or IT depts really care. Perhaps the growth of Zillow & Trulia feeds and cost of developing them will bring this issue into greater focus?

    I think the real estate industry should attack this problem in small chunks, because getting a real standard is too complex politically. Perhaps, it’s too expensive technically because the industry as a whole has ignored the problem since at least the beginning of the internet era.

    For example, when I once attempted to support 2 MLSs with my IDX software, I was shocked at the differences between them. Let me list the differences in how they handled listing images (I’m not even talking about the listing data, its schema, or transport here, just the images)…

    Some MLSes use FTP for photo download. Some use HTTP for photo download. Some have all the images online all the time. Some only have the changed images online. Some store the images individually. Some combined changes in ZIP files. Some store the images in different sub-directories based on the last 2 digits of the MLS #. Some use a greater or lesser number of sub-directories to store the images. Some use a letter suffix in the filename to determine the order of the images. Some use a number suffix in the filename to determine the order of the images. Sometimes the first image of listing has a suffix. Sometimes the first image of listing doesn’t have a suffix. And don’t get me started on image sizes or thumbnails…

    I know getting reasonable people to agree on what a bedroom is or how to measure sq ft is perhaps a long-winded political issue, but instead of creating a standard, maybe the industry would be better served by eliminating differences and driving adoption of these standard practices. It has the same long term effect as creating a standard, but has much less short term disruption.

    Think agile. Instead of one HUGE task, how about 30 smaller ones that have the same end result?
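
To make the image differences listed in this comment concrete, here is a small Python sketch of what an IDX vendor might end up writing: one configuration object per MLS that captures the sub-directory and suffix conventions. Every name here (`PhotoConvention`, `photo_path`), the underscore separator, and the `.jpg` extension are assumptions for illustration; none of it describes a specific MLS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhotoConvention:
    """Hypothetical per-MLS description of how photo files are named."""
    subdir_digits: int      # trailing digits of the MLS # used as sub-directory (0 = none)
    letter_suffix: bool     # True: a, b, c ...; False: 1, 2, 3 ...
    first_has_suffix: bool  # whether the first image carries a suffix at all


def photo_path(mls_number: str, index: int, conv: PhotoConvention) -> str:
    """Build the relative path of the photo at 0-based position `index`."""
    if index < 0:
        raise ValueError("index must be non-negative")
    if index == 0 and not conv.first_has_suffix:
        suffix = ""
    else:
        # Assumption: when the first image is unsuffixed, the second image
        # gets the first suffix ("a" or "1").
        pos = index if conv.first_has_suffix else index - 1
        if conv.letter_suffix:
            if pos >= 26:
                raise ValueError("letter suffixes only cover 26 photos here")
            suffix = "_" + chr(ord("a") + pos)
        else:
            suffix = "_" + str(pos + 1)
    filename = f"{mls_number}{suffix}.jpg"
    if conv.subdir_digits > 0:
        return f"{mls_number[-conv.subdir_digits:]}/{filename}"
    return filename
```

Three fields already produce many combinations, and the sketch ignores transport (FTP vs. HTTP), ZIP bundles, and incremental feeds. Agreeing on any one of these conventions would remove a field from the configuration, which is the small-chunks approach suggested above.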

  26. […] a post last week, I asked a pretty open-ended question: “Are the costs of an MLS [coverting to a standard […]

  27. “Some MLSes use FTP for photo download. Some use HTTP for photo download. Some have all the images online all the time. Some only have the changed images online. Some store the images individually. Some combined changes in ZIP files. Some store the images in different sub-directories based on the last 2 digits of the MLS #. Some use a greater or lesser number of sub-directories to store the images. Some use a letter suffix in the filename to determine the order of the images. Some use a number suffix in the filename to determine the order of the images. Sometimes the first image of listing has a suffix. Sometimes the first image of listing doesn’t have a suffix. And don’t get me started on image sizes or thumbnails…”

    Some PUSH photos via FTP. Some allow the last three days worth of new or changed to be downloaded at any time of the week, then on the weekend will allow the full photo download (which they recommend you do, as inevitably some photos do not make their way into the three day weekly incremental downloads).

    With the new offerings from Amazon, Limelight, Edgecast, etc. for content delivery, I’d love to see every image hosted across the board (so to speak) by the boards. I believe that Amazon is even coming out with a new pricing model where you (the MLS) could bill your client (broker or vendor) for the hosting. Agents and Brokers need to demand more from these MLS’s: simplification of data and photo distribution, which will lead to more exposure with less cost and fewer problems overall.