Copyright law is a triple-edged sword. Historically, it has been used to make sure that authors and rock musicians get their due, but it’s also been extended to the breaking point by firms like Disney. Strangely, a concept that protected creative arts got pressed into duty in the 1980s to protect the writing down of computer instructions, ironically just a comparatively few bytes of BIOS code. But as long as we’re going down this strange road where assembly language is creative art, copyright law can also be used to protect the openness of software. And doing so has given tremendous legal backbone to the open and free software movements.
So let’s muddy the waters further. Looking at cases like the CDDB fiasco, or the most recent sale of ADSB Exchange, what I see is a community of people providing data to an open resource, in the belief that they are building something for the greater good. And then someone comes along, closes up the database, and sells it. What prevents this from happening in the open-software world? Copyright law. What is the equivalent of copyright for datasets? Strangely enough, that same copyright law.
Data, being facts, can’t be copyrighted. But datasets are purposeful collections of data. And just like computer programs, datasets can be licensed with a restrictive copyright or a permissive copyleft. Indeed, they must, because the same presumption of restrictive copyright is the default.
I scoured all over the ADSB Exchange website to find any notice of the copyright / copyleft status of their dataset taken as a whole, and couldn’t find any. My read is that this means that the dataset is the exclusive property of its owner. The folks who were contributing to ADSB Exchange were, as far as I can tell, contributing to a dataset that they couldn’t modify or redistribute. To be a free and open dataset, to be shared freely, copied, and remixed, it would need a copyleft license like Creative Commons or the Open Data Commons license.
So I’ll admit that I’m surprised not to have seen permissive licenses used around community-based open data projects, especially projects like ADSB Exchange, where all of the software that drives it is open source. Is this just because we don’t know enough about them? Maybe it’s time for that to change, because copyright on datasets is the law of the land, no matter how absurd it may sound on its face, and the closed version is the default. If you want your data contributions to be free, make sure that the project has a free data license.
33 thoughts on “Copyright Data, But Do It Right”
Of course, one wonders if our patron saint of openness didn’t foresee this problem, otherwise there would have been an inclusion in the GPL.
>Historically, it has been used to make sure that authors and rock musicians get their due
Or rather, that their publishers and labels who actually own the copyrights get their money.
(GIF of Claymation Gromit slowly nodding)
Good reason to have an understanding of contract law then.
I did a fairly basic intro to music copyright law twenty years ago and oh man, do they see fresh naive artists coming. For example it’s not uncommon to discover significant percentages of record sales being discounted from artist revenue in clauses that were originally added to cover the expected loss of pressed records due to the brittle nature of Bakelite.
Part of me hopes the “standard recording contract” has changed since then to better reflect digital music sales, but I wouldn’t be at all surprised if it hasn’t.
Say, if you’re in a band that is officially the property of the label, is your work “for hire” and you own none of the copyrights to your own music by default?
Copyright law got twisted FAR beyond where it should have when it was applied to computer code.
Copyright law in the US is explicitly for creative products only – anything functional is excluded from protection.
This means that the only part of a computer program that should legally be protected is the comments – everything else is functional instructions to the computer to do something.
Patent law also got twisted here. Patents are only meant to protect physical devices and chemical processes that irreversibly change one material to another. Lawyers decided to argue that the change in arrangement of electrons in a computer constituted a process, despite such a change and both starting and ending states being unique not only to each computer the program runs on, but different for EVERY SINGLE RUN ON EACH COMPUTER.
Copyright should not protect executable code – that is entirely functional (data used by the program might be protected, but any data that is used to enable/disable functionality like the Nintendo logo used as DRM in early Nintendo consoles loses copyright for that use).
Patents should not protect programs – they are not devices, nor do they inherently cause irreversible chemical changes.
If lawyers want to protect computer programs, they should lobby for a new type of intellectual property law.
I think one big problem with this is that art is also functional, so there isn’t really this distinction between functional computer instructions and non-functional artwork. Or perhaps I should say that it can be reasonably argued that art has a function.
Anything functional is patentable. Copyrights should not extend to what is already covered by patents, or they would cause indefinite monopolies on inventions.
Historically it has been used for censorship. When the censorship was lifted it was kept to protect the economic interests of the printing presses that used to enforce censorship.
We had authors and musicians for thousands of years before copyright. What Shakespeare did (getting “inspiration” for his plays) would be copyright infringement today.
>What prevents this from happening in the open-software world?
The fact that there could be a copy of the database somewhere and anyone “buying” it couldn’t secure exclusive access. The issue is that datasets are copyrighted by default, which means copyright is the cause of the whole problem, not the solution.
Copyright should not be opt-out: If you want to protect it, you should have to apply and pay for the protection like with patents.
Yeah, why can every pleb without money claim ownership of their IP?!? Socialism!
Not defending ADSB Exchange, but like most sites fed from community contributions they probably had some fine print about uploaded data being licensed to (if not straight up owned by) said service.
Well, there you hit on the next obvious problem: if nobody owns the copyright by default, then who are you buying it from? It would require that the state claims ownership of all creative works, which is nonsense. In other words: copyright should not exist in the first place.
You realise this would only serve to make the small guy poor? Big guys would pay out to copyright everything.
Small guys would have to choose between pouring money into copyrighting things or having them stolen.
Patent law in the US is no longer first to invent, it is first to file. Granted, that was 10 years ago now but still noteworthy.
Patent law is definitely a valid topic of conversation. But patents and copyrights are different things serving different purposes so not really today’s topic.
You realize it’s the big guys buying the IP off either way? When you sign a publishing contract, you sell your copyright away – or else you won’t get the publishing contract. Copyright doesn’t serve the “small guys” one bit because they can’t effectively protect it or do anything with it except sell it to some big corporation.
That’s also why the established and famous authors keep starting their own publishing agencies and record labels to keep hold of their own IP.
With copyrights being like patents, it would cost way too much for corporations to hold unimaginably huge portfolios of copyrights like they do now, and they would expire sooner either way.
The problem with the opt-in approach is: under that regime, if I write someone a letter (yes, I’m that old) I no longer own the copyright. Who does? No one? No one. J.D. Salinger would be screwed. (If you have the internet in your town you can look up that situation.)
Would it have killed the author to say “datasets” instead of “data” in the title? As Dennis Rodman once said “She ‘bated me.” Hackaday baited me in this instance, and it’s not quite the same feeling.
A triple edged sword is an epee.
Unlike a Broadhead arrow point.
(Working from the Title image)
I see what you did there, but, an epee has no edge, just a point.
Technically, most if not all weapons with triangular blade profiles have some historical examples with sharpened edges. Presumably the thinking goes: if I get the opportunity to put in a painful slash, I should. They are still clearly designed for the thrust, though, with a thicker, stouter construction that gives the inflexibility to really drive that point home, compared to a cutting-focused blade, where the spine wants to stay thin and so tends towards springiness.
They also cause a more unstable wound that does not readily close on its own, at least not without “modern” suture techniques. They were the medieval hollow-points. The thrust of which was to inflict a casualty of a longer duration and higher probability of death due to infection and other pestilences. An important factor in a siege battle that could last years.
Interestingly enough, the OpenStreetMap project, which is used as a backdrop for ADSB Exchange, has its data protected by the ODbL.
Why not just leak the datasets? Anonymity makes copyright unenforceable. Who is copying what? We don’t know!
You can’t use the leaked datasets without getting sued over it.
If you use them anonymously, who is getting sued?
Indeed they do.
“They aren’t evil overlords”.
-no one ever
It is too late for CDDB, but ADSB Exchange relies on current data. Unless those people supplying the data agreed to continue to supply it, they could just cut off the data, and the site would become useless.
Factor in that part of the reason it was bought was to add the “filter for a fee” feature, which some very rich people will happily pay for so that competitors cannot track their geographic movements in near real time. But even if they only have the ability to filter historical data, that is still a very valuable service to many. Most of the time, historical data over long periods of time is far more valuable than real-time data once it is cross-correlated with other harvested metadata. The sum of the total metadata is more valuable than the individual metadata sources.
Oh please, the fate of ADSB was decided by the script kiddies that abused it for the sake of achieving internet fame. This is not an example of where copyright would have helped at all, rather they poked a bear and got mauled as they deserved.
When you start abusing the data to stalk people then it’s no longer a project dedicated toward the greater good, it’s a security hole that needs plugging. If you can’t be responsible with these tools then they will be taken away, and the judges and juries won’t side with you even if the books say they should.
Some people and institutions are just untouchable for the sake of peace and order, and they will get everything they want in order to maintain that. Don’t bother trying to change that, just move on and make a new version that won’t land users in hot water.
Not sure this works that well in the US (yet, anyway). AFAIK, Feist and West v. Mead are still good law.