New technology often brings with it a bit of controversy. When considering stem cell therapies, self-driving cars, genetically modified organisms, or nuclear power plants, fears and concerns come to mind as much as, if not more than, excitement and hope for a brighter tomorrow. New technologies force us to evolve perspectives and establish new policies in hopes that we can maximize the benefits and minimize the risks. Artificial Intelligence (AI) is certainly no exception. The stakes, including our very position as Earth’s apex intellect, seem exceedingly weighty. Mathematician Irving Good’s oft-quoted wisdom that the “first ultraintelligent machine is the last invention that man need make” describes a sword that cuts both ways. It is not entirely unreasonable to fear that the last invention we need to make might just be the last invention that we get to make.
Artificial Intelligence and Learning
Artificial intelligence is currently the hottest topic in technology. AI systems are being tasked to write prose, make art, chat, and generate code. Setting aside the horrifying notion of an AI programming or reprogramming itself, what does it mean for an AI to generate code? It should be obvious that an AI is not just a normal program whose code was written to spit out any and all other programs. Such a program would need to have all programs inside itself. Instead, an AI learns from being trained. How it is trained is raising some interesting questions.
Humans learn by reading, studying, and practicing. We learn by training our minds with collected input from the world around us. Similarly, AI and machine learning (ML) models learn through training. They must be provided with examples from which to learn. The examples that we provide to an AI are referred to as the data corpus of the training process. The robot Johnny 5 from “Short Circuit”, like any curious-minded student, needs input, more input, and more input.
Learning to Program
A primary input that humans use to learn programming is a collection of example programs. These example programs are generally printed in books, provided by teachers, or found in various online samples or projects. Such example programs make up the corpus for training the student programmer. Students can carefully read through example programs and then attempt to recreate those programs or modify them to create different programs. As a student advances, they usually study increasingly complex programs and they start combining techniques discovered from multiple example programs into more complex patterns.
Just as humans learn to program by studying program code, an AI can learn to program by studying existing programs. Stated more correctly, the AI trains on a corpus of existing program code. The corpus is not stored within the AI model any more than the books studied by a human programmer are stored within the student. Instead, the corpus is used to train the model in a statistical sense. Outputs generated by the trained AI do not come from copies of programs in the corpus, because the trained AI does not contain those programs. The outputs should instead be generated from the statistical model of the corpus that has been trained into the AI system.
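To build some intuition for what training a statistical model of a corpus means, consider a toy next-token generator: it records only counts of which token follows which in the training examples, then produces new text by sampling from those counts. The sketch below is purely illustrative and enormously simpler than the models behind real code-generation systems; the tiny corpus, the names, and the bigram approach are all assumptions made for the example, not details of Codex or Copilot.

```python
import random
from collections import defaultdict

# A toy "corpus" of example code lines, standing in for a real training set.
corpus = [
    "for i in range(10): print(i)",
    "for item in items: print(item)",
    "for key in data: print(key)",
]

# "Training": count how often each token follows each other token.
follow_counts = defaultdict(lambda: defaultdict(int))
for line in corpus:
    tokens = line.split()
    for current, nxt in zip(tokens, tokens[1:]):
        follow_counts[current][nxt] += 1

def generate(start_token, length=6):
    """Sample a short token sequence from the learned statistics."""
    output = [start_token]
    for _ in range(length):
        choices = follow_counts.get(output[-1])
        if not choices:
            break
        next_tokens, weights = zip(*choices.items())
        output.append(random.choices(next_tokens, weights=weights)[0])
    return " ".join(output)

print(generate("for"))
```

The model keeps only the statistics in `follow_counts`, not the corpus itself, yet its output can still coincide with lines from the corpus when the statistics strongly favor one continuation.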
AI Systems that Generate Code
GitHub Copilot is based on the OpenAI Codex. It uses comments in the code of a human programmer as its natural language prompts. From these prompts, Copilot can suggest code blocks directly into the human programmer’s editor screen. The programmer can accept the code blocks, or not, and then test the new code as part of their program. The OpenAI Codex has been trained on a corpus of publicly available program code along with associated natural language text. Public GitHub repositories are included in that corpus.
Copilot documentation does claim that its outputs are generated from a statistical model and that the model does not contain a database of code. On the other hand, it has been discovered that code suggested by the AI model will match a code snippet from the training set only about one percent of the time. One reason for this happening at all is that some natural language prompts correspond to a relatively universal solution. Similarly, if we were to ask a group of programmers to write C code that uses binary trees, the results might largely resemble the code in chapter six of Kernighan & Ritchie because that is a common component in the training corpus for human C programmers. If accused of plagiarism, some of those programmers might even retort, “That’s just how a binary tree works.”
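As a concrete, hypothetical illustration of that workflow and of why such convergence happens, the snippet below shows a programmer-written comment acting as the prompt, followed by the kind of body a code-suggestion tool might fill in. It is written as a Python sketch rather than K&R-style C, and it is not an actual Copilot suggestion; it simply shows that there are only so many sensible ways to express the canonical solution.

```python
# Prompt written by the programmer:
# insert a value into a binary search tree and return the (possibly new) root

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    """The kind of completion a code-suggestion tool might offer."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root
```

Ask many independent programmers (or a model trained on their code) for this function and the results will look nearly identical, which is exactly the “that’s just how a binary tree works” defense.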
But sometimes Copilot will recreate code and comments verbatim. Copilot has implemented a filter to detect and suppress code suggestions that match public code from GitHub. The filter can be enabled or disabled by the user. There are plans to eventually provide references for code suggestions that match public code from GitHub so that the user can look into the match and decide how to proceed.
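GitHub has not published how that filter works, but the general idea can be sketched as a verbatim-match check: normalize a suggestion and see whether it appears in an index built from known public code before showing it to the user. Everything below, the function names, the whitespace-only normalization, and the tiny index, is an assumption made for illustration, not GitHub’s implementation.

```python
def normalize(code: str) -> str:
    """Crude normalization: collapse whitespace so trivial formatting
    differences do not hide a verbatim match (an assumed rule, not
    GitHub's actual matching logic)."""
    return " ".join(code.split())

def build_index(public_snippets):
    """Pre-compute a set of normalized snippets from public code."""
    return {normalize(snippet) for snippet in public_snippets}

def filter_suggestion(suggestion, index):
    """Return the suggestion, or None if it matches known public code."""
    return None if normalize(suggestion) in index else suggestion

# Hypothetical usage:
index = build_index(["int add(int a, int b) { return a + b; }"])
print(filter_suggestion("int add(int a,  int b) { return a + b; }", index))  # None (suppressed)
print(filter_suggestion("int sub(int a, int b) { return a - b; }", index))   # passes through
```

A real filter would have to work at scale and tolerate more than whitespace differences, which is presumably why providing references to the matched public code is the longer-term plan.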
Is Learning Always Encouraged?
Even if it’s very rare that an AI model trained on a corpus of example code later generates code matching the corpus, we should still consider instances where the code should not have been used to train the model to begin with. There may be limits to when and which source code can be used for training AI models. Looking to the field of intellectual property, software can be protected by patent, copyright, trademark, and trade secret.
Patents generally offer the broadest protection. When a system or method practices one or more claims of a patent, it is said to infringe the patent. It does not matter who wrote the code, where it came from, or even if the programmer had no idea of the existence of the patent. Objections to software patents aside, this one is straightforward. If an AI model generates code that practices a patented method, it does not matter whether that code matches any existing code; there is a real risk of patent infringement.
Trade secret only applies in the highly pathological situation where the source code was misappropriated, or stolen, from the original owner who was acting to keep the source code secret. Obviously, stolen source code should not be used for any purpose including the training of AI models. Source code that has been published online by its author or owner is not being protected as a trade secret. Trademarks only really apply to names, logos, slogans, or other identifying marks associated with the software and not to the source code itself.
When considering AI model training, copyright concerns can be a little more nuanced. Copyright protection covers original works of authorship fixed in a tangible medium of expression, including literary, dramatic, musical, and artistic works, such as poetry, novels, movies, songs, computer software, and architecture. Copyrights do not protect facts, ideas, systems, or methods of operation. Generally, studying copyrighted code and then rewriting your own code is not an infringement of the original copyright. Copyright does not protect the concepts or operations of computer code; it merely protects the specific expression or presentation of the code. Anyone else can write their own code that accomplishes the same thing without offending the copyright.
Copyright can protect computer code from being reproduced into other code that is substantially similar to the original. However, copyright does not protect against reading, studying, or learning from computer code. If the code has been published online, it is generally accepted that others are allowed to read it and learn from it. At one extreme, the concept clearly does not extend to reading the protected work with a photocopier to make a duplicate. So it remains to be seen if, and to what extent, the concept of being free to read will extend to “reading” the copyrighted work into an AI model.
Law and Ethics Controlling the Corpus
There is litigation pending against GitHub, Microsoft, and OpenAI alleging that the AI systems violate the legal rights of programmers who have posted code on public GitHub repositories. The lawsuits specifically point out that much of the public code was posted under one of several open-source licenses that require derivative works to include attribution to the original author, notice of that author’s copyright, and a copy of the license itself. These include the GPL, Apache, and MIT licenses. The lawsuits accuse defendants of training on computer code that does not belong to them without proper attribution, ignoring privacy policies, violating online terms of service, and offending the Digital Millennium Copyright Act (DMCA) provisions that protect against removal or alteration of copyright management information.
It is interesting to note, however, that the pending suits do not explicitly allege copyright violation. The defendants posit that any assertion of copyright would be defeated under the fair use doctrine. The facts do appear to parallel those in Authors Guild v. Google, where Google scanned in the contents of books to make them searchable online. Publishers and authors complained that Google did not have permission to scan in their copyrighted works. However, the court granted summary judgment in favor of Google, affirming that Google met the legal requirements of the fair use doctrine.
An interesting open project for the development of source code models is The Stack. The Stack is part of BigCode and maintains a 6.4 TB corpus of source code under permissive license. The project seems strongly rooted in ethical transparency. For example, The Stack allows creators to request removal of their code from the corpus.
Projects like Copilot, OpenAI, and The Stack will likely continue to bring very interesting questions to light. As AI technology advances in its ability to suggest code blocks, or eventually write code itself, clarity around authorship rights will evolve. Of course, authorship rights may be the least of our worries.
I wonder how many folks will still be annoyed if every program written with AI aid, likely trained only on stuff under various open licenses, is always automatically copyleft licensed itself. You do lose the attribution, since nobody wants to waste time looking for all the prior programmers who contributed bits here and there, but at least the results are supposed to be available – so the benefits of open source that the programmers obviously believed in enough to publish under remain.
Still, the chance of the AI killing off folks’ jobs is perhaps as much the reason for kicking up a fuss, and it doesn’t mean all the prior licences will be entirely complied with – the GPL versions, MIT, etc. are similar but not the same as each other – but at least it stays permissive, which has to be better than taking stuff that should be copyleft only and using it incorrectly…
There’s a world of difference between “open source” licenses.
Someone who chose GPL may not want their derivative work used in a commercial product where source is not available. Someone who chose MIT may only want attribution and doesn’t want their derivative work used where source must be made available. Heck, there’s even disagreement between GPL users. Linus will not use GPLv3 due to the additional restrictions it requires, but the SAMBA team will not use GPLv2 because they don’t want it locked away like TiVo once did.
Smearing each creator’s wishes behind an “at least it’s open source” facade is just too disrespectful of their intent in creating the content, IMHO.
Indeed, I did note that issue – but forcing the AI-assisted programs to the most permissive copyleft licensing terms from the training data has to be a better option than all this open-source code being used freely for closed-source stuff – that is definitely against the majority of the original programmers’ intent no matter which of the many license options they used!
And as the training data collection didn’t care about licenses, trying to sort it out afterwards would be impossible, so the question has to be: is it sufficient to force the open copyleft style and allow the use of the tool? That is really about the only practical option that actually lets the tool be used.
Chinese AI is already stealing ALL the code.
Only if AI advances to a point where it can innovate.
I’ve tried using OpenAI to solve some really obscure problems and it didn’t come up with one single usable answer. I guess all it can do well is generate code for checklists, a DVD or movie database, etc., because those are the most common examples on the internet. As soon as you try something that’s off the beaten track it’ll produce nonsense.
But it’s great whenever it does give you proper answers, although chances are what you’re trying to do is reinventing the wheel and you should use a library instead. It would be great if the AI saw what you were doing and suggested a library that not only does exactly what you want, but does it much better and more securely than the average programmer can themselves.
I think AI is great for doing the menial and repetitive tasks of programming like refactoring, filling in scaffolding, searching for common mistakes and security holes, etc., so that you can focus on solving the actual problems that AI has never seen before and makes a mess of trying to solve.
An AI can’t really do inspired thinking. Like seeing that this piece of metal, bent with several sharp turns, is something to hold papers together, and then thinking of unbending it to poke in a hole to unlock a DVD drive, for example – not unless it has come across several references in its training set.
Agreed. Now…I do think AI poses severe dangers if not vigorously restrained and controlled. A Navy-funded study, a few years back, said as much, recognizing possible “Skynet” scenarios.
However, I do not believe AI will ever be truly conscious or self-aware…even if capable of pretending it is.
I’m far less worried about AI becoming truly conscious or self-aware than I am of it becoming heavily armed
If an AI is good enough at pretending to be conscious that you can’t recognize that it isn’t, then who’s to say it isn’t?
If you are training it on GitHub etc. examples willy-nilly, you’re training it with bugs.
Garbage in, garbage out.
GitHub, forums, Stack Exchange, decades of wrong ideas and bad advice. Now, if it could clean the internet of all the old garbage that shows up in searches….
Maybe it won’t be code.
The thing was that people in the past said the low-skilled, low-paid jobs would be replaced with robots.
Now that does happen, but robots cost money, both initially and ongoing.
Replacing a job with AI has a much higher ROI, given that the job you replace is likely desk-based, creating digital things, and likely higher paid.
So the idea that the blue-collar / lower classes will all lose their jobs because of automation is rapidly turning into the white-collar / middle classes losing theirs.
Suddenly universal income got a lot more appealing.
A parasite, once attached to a reliable host, has no need or incentive to do anything further.
If you think guaranteed income will “free” us to suddenly become a nation of 300 million artists and craftsmen, think again.
In the end, “universal income” is a phrase used to sound brainy when in fact one is really proposing the dumb idea of paying people to sit on their rears. Its arrival in any wide-sweeping form will mark the beginning of the end of civilization.
Don’t look now; we’re already paying people to sit on their rears.
Under the current regime, you may have significant benefits as an unemployed person that you would lose the moment you do find a job. There’s actually incentive to NOT work, in many cases.
UBI would not be conditional on employment, so any small amount you earn over UBI would only contribute to your well being, not disqualify you for existing benefits. And UBI should be set very near subsistence levels; it SHOULD be uncomfortably close to poverty. If someone is ok living at that level, I’m here to tell you they’re already living off of welfare.
This is the way UBI should work if it’s implemented. But it’s not how most advocates of it want it to work. They expect generous UBI paid for by the golden goose of “taxing companies”.
This is similar to how I envision it. Imagine all of the unnecessary welfare administration infrastructure removed, and that money distributed as UBI. You don’t need a bunch of caseworkers, forms, etc. Adults get a certain amount per month. Period. If you want more, you do more. If you don’t, the government doesn’t attach any stigma to you. Your family, friends, and society in general may or may not, however.
It would cause perverse incentives, because the cost of living is not the same for everyone and everywhere. Large parasitic communities could exist on these wealth transfers alone. People would also band into communes that pool their UBI, or start having larger families to “farm” the system…
>UBI would not be conditional on employment, so any small amount you earn over UBI would only contribute to your well being
That would be a direct subsidy to corporations, because it enables poor people to accept jobs that do not pay a living wage. Competition between workers enables corporations to cut pay, especially at entry level and low-skilled jobs. This is the “Walmart effect” – any social welfare you get is subtracted from your salary as things progress over time. This is why you should either work, or receive welfare, not both at the same time.
As a corollary, UBI has a second harmful effect in enabling jobs that are not sustainable in the normal economic sense. A job that can’t pay its workers a living wage provides negative value to the society overall, because by definition it doesn’t return what it costs to perform. Societies working in this way simply become poorer and poorer. Yet with UBI, the true cost of the worker disappears and corporations can profitably hire people to do all sorts of pointless things that profit the company, at the expense of the community. In other words, UBI leads to greater wealth disparity.
These problems mean you will never have a “simple” UBI implementation. You won’t get rid of the bureaucracy because you then have to start implementing other measures, like minimum wage laws to combat abuse, and trying to measure real productivity directly in order to catch the freeloaders and destructive enterprises. Pretty soon you’ll end up with something resembling the Soviet 5-year plan system.
The obvious answer is to build terminators to “retire” the people whose jobs are eliminated by AI, or are deemed by the algorithm to be unproductive or superfluous.
A quiet tax free utopia awaits!
Hmmm, let’s see. An open US/ Mexico border and UBI…………………
Oh my God!!!!
>If you think guaranteed income will “free” us to suddenly become a nation of 300 million artists and craftsmen, think again.
Worse: it would do exactly that.
Think about what it means to have a society where the value of work is not measured by whether it returns the spent value back to the society with some profit if possible, but simply by whether someone is fool enough to pay.
There will be no need for artists either, with mass-produced art. This is just the obsolescence of humanity.
The race to stay ahead of piracy will guarantee those artists and craftsmen will not remain static.
my reaction to the headline:
Gosh, I hope so!
Historically, technology exists specifically to increase productivity. One person with an ox can plow more than one person can by hand. And with a tractor instead of an ox, a person is more productive still. End result: far fewer people work on farms than a century ago (yes, that was only 1923, and they had tractors back then).
Repeat this for every industry. From textiles to construction. A person operates a machine that does the labor of many. Ideally this increase in productivity yields a higher standard of living.
So why shouldn’t doctors be able to handle more patients in the same amount of time? Or programmers debug or write more software? Engineers design more chips in a smaller area, at a lower power, and in less time?
That said, current so-called AI is garbage. What it outputs is basically useless. I can’t use its output to publish whitepapers. I can’t take the code and use it in a production environment. I can’t even effectively use an AI generated test plan because I still have to make sure it really did what it was asked and covers my original requirements.
Ultimately AI is going to be trained to pretend to do its job. Like your laziest and most irresponsible employee, it will be rewarded time and time again for meeting the letter of the requirement without grasping the spirit and without going above and beyond what is required. The only reason AI will have a job is because it’s free; any employee like that you would have fired right away.
Every technology that ends a career path is met with outrage and vitriol. Technological advances can negate entire professions and put people invested in those careers out of their jobs.
Yes, it also frees those people to find other jobs and increase overall productivity, but only after a period of major disruption, often involving mass unemployment and retraining. Imagine spending your best years gaining world-class expertise in a specific field only to have all that expertise be suddenly worthless due to some tech advance. Then imagine someone shrugging their shoulders and saying, “Sucks to be you! Just go find something else to do before you retire in a couple of years.”
Yes, overall, society is better off without leeches like the health insurance industry, but there are enough people invested in their medical coding careers to bend the politicians’ ears and prevent single-payer healthcare from passing. For example…
A lot of people completely disregard when a new technique came to be, how the world was at the time, and how long it took for the technique to proliferate in the market.
Plowing fields with oxen requires that one breeds forth said oxen. A process spanning decades. So it isn’t a particularly fast transition. And a transition that also happened in times of general food scarcity the world over.
The textile loom might seem more impactful.
But building a textile loom requires a hefty amount of manual labor – fairly technically skilled labor at that – so production of the looms wasn’t particularly fast. Yes, this did happen faster than breeding oxen, but it still took decades.
AI/ML systems on the other hand.
Well, it turns out computers from 10 years ago are somewhat decent at running most modern AI/ML workloads. Yes, they won’t have Stable Diffusion spit out an image in a second or two – it might take a few minutes – but all things considered that is still insanely fast.
So it is more a question of when someone can train a suitably useful AI/ML system. In some fields, that is almost here. Oftentimes by relying on the work already done by the very people who will get replaced.
In short, a lot of content creators – be it artists, authors, programmers, or the like – have effectively dug their own grave. It is only a question of time until someone sorts through this data and makes an AI/ML system that is actually competent. And at that point, everyone can effectively already run it.
One way to ease the transition is to require that all AI/ML training corpora contain only explicitly authorized data. Then content creators’ work and copyright licenses (be they proprietary or open source) will be respected, in a way that effectively means these people and organizations aren’t digging their own grave.
Agree! AI is just a phrase used for something that is light years away from its true meaning. It’s just a bunch of math expressions used and connected in an unusual way, but so ineffective, due to its limited and primitive learning capabilities, and so hard to use, that it is useless for the majority of potential users. NNs are as primitive as basic math calculus for grade school. The software lacks auto-fit hyperparameters and has very poor learning capability. On learn and test data sets it doesn’t even surpass 60% accuracy on average, and on some test data sets it fails 100% of the time no matter what hyperparameters are chosen. On the test data used to validate the NN’s performance, the NN should have more than 90% accuracy because it has learned from it. But if you go and question-test it yourself (not believing the internal confusion matrix, ROC analysis, calibration plot, and performance curve), because you want to confirm the ANN’s performance, then you get your real results, which are completely way off and useless, telling the opposite of the internal performance tests named above. To make things worse, one has to guess the hyperparameters. Have you looked at how many possibilities there are? One has a better chance of winning the lotto than guessing the proper hyperparameters for an NN to fit the data. I don’t know where the so-called PhD experts on this AI matter are getting their results; mine are useless. In the end we are using the same goddamn software. And there isn’t any decent software developed for data pattern analysis, or I haven’t found it yet. AI, ANN, and NN learning capabilities are still way too primitive, and the software in this area is way underdeveloped. The train-test-validate process has to be straightforward, with auto-fit hyperparameters at least for a given dataset and near 100% accuracy, with relearning capabilities on the validation dataset, in order for the NN to fit the data and not the other way around, as published on some sites on the net. No such software, to my knowledge, exists. Developers of AI, ANN, NN, and data mining software are wrongly thinking it’s all about the data and the results and not about the NN software itself! No wonder it is hard to use and developed the wrong way. The software should be developed in such a way that engineers from all branches can use it, and should be rewritten completely with user-friendly GUI capabilities.
>So why shouldn’t …
Productivity simply for productivity’s sake is missing the point. It carries similar issues as the “paperclip maximizer” case where an AI turns the whole earth into a paperclip factory.
Mind: the need for productivity is to make something consumable by people. Assuming that we don’t massively increase the number of people, or massively increase their consumption of basic resources, the demand is more or less fixed. Therefore any increase in productivity will necessarily put people out of productive jobs since there is no demand to perform more such work.
So what do people do instead? They do the opposite: they start coming up with meta-work which may require a lot of effort, but is ultimately based on consuming and making other people consume more stuff in order to catch the spill-over for yourself. Manufactured demand. A classic example is a person who simply begs money on Youtube for a video of themselves playing a video game – on the point of “supporting the creative arts” – which really contributes nothing but a waste of electricity and hardware. It doesn’t increase the material wealth and sustenance of people, or even replace what was lost – it does the opposite.
The more people live like this, assuming the first point of not increasing total consumption, the less material wealth everyone will have at their disposal. First, it costs something to have the automation which displaces people, so that already removes something from the total amount of resources that can be allocated to people, and then the people themselves waste it in pointless activities that are simply games to earn money, which earns you an allocation of the resources that you are wasting.
Even if we assume that the people won’t do the stupid thing and simply burn resources to generate “work” for themselves: if you replace a person with a robot, instead of improving efficiency, you actually reduce efficiency because you will have the same output – as limited by demand and ultimately by the earth’s carrying capacity – and you have both the robot and the person to “feed”. The person will be idle, but they will be materially worse off.
A person who performs productive work, regardless of being less efficient at it, does not consume resources pointlessly. Even if you replace a thousand workers with one robot, you still have to pay the thousand people to exist, and the robot, so ultimately you come out worse.
Although, since “productivity” is measured in terms of GDP, it does not actually differentiate between making more resources and consuming them. It simply counts whether someone gets paid money for it. That’s how you can increase productivity by neither producing nor consuming more – simply by shifting money around faster.
If that youtuber was otherwise going to buy a bunch of physical toys, or commute for an hour and a half in a car each day, it might still be a net benefit to society. I feel the same way about NFTs. Sure they’re “stupid” in every way, but if rich jerks are buying NFTs instead of Bentleys to satisfy their urges for conspicuous consumption, the planet still benefits.
A recent article (https://www.theregister.com/2023/02/06/uh_oh_attackers_can_extract/) covered research showing that generative AI was capable of regurgitating degraded but clearly recognisable copies of images from the training data. I think results like this make it very hard to argue that a trained ML model is not a derivative work of the training set.
It will be interesting to see which way the courts go here because there doesn’t seem to be much difference between feeding copyright material into an AI and getting a slightly degraded copy of the original back, and feeding copyright material (say a movie) into a different mathematical model (say the H.264 algorithm) and getting a slightly degraded copy of the original back.
The entire point of how Stable Diffusion works is by attempting to return the original images back from noise. That alone says it must retain information from the original images – the output is simply a scramble of the data, not generated out of nothing.
Not to give anyone ideas but copyright has one big loophole and it’s cleanroom design.
https://en.wikipedia.org/wiki/Clean_room_design
You put one AI on dissecting copyright protected code and boiling it down to the essential elements. Then you put another AI on building up from those requirements. The second AI must have no knowledge of the first, or the software that is being reverse engineered.
And then string the two AIs together into a neat program where one simply writes a single line in the console to make a “copy” of program A, but have program B be in whatever language one needs.
Can make it a GUI program just as well.
However.
Even cleanroom practices can be copyright infringing.
It isn’t just about it being two individuals that have no direct relation to each other.
Secondly, legislators can argue that any conglomeration of automated systems will be regarded as copying, regardless of whether the individual parts of that larger copying system were themselves made with that intent.
So it isn’t really a loophole that AI can use.
And honestly, AI systems generating code today are likely infringing. However, the law has little consideration for these sorts of systems yet.
Only time will tell what side of the copyright fence AI content generation will fall down on.
But for the benefit of open source projects and content creators alike, it better fall on the infringing side of the fence if it used any unauthorized data in its training corpus.
What I have seen so far seems to be training on bad examples. I think we are going to have a case of severe inbreeding, sitting with moron AIs three generations from now, each trained on the gobbledygook from the previous generation.
The best role for current AI is QA. Let it walk through the UI process, compare output to the previous version, report whether differences are expected and if not, assist with debugging via breakpoints tracking state. Promotion to production (only after human review) updates documentation as well as deploying code.
And Paralegal of course for the explosion in liability lawsuits as chatbots are held to the same standards as human employees…
Ohh dear, I think it is totally the opposite! AI will code whatever requirement you need, but do you think that quality evaluation will be left to bots? Of course not, we will only trust humans for quality inspection!
My programming friends are using these tools and they say they’re magic. Huge increase in productivity – it bypasses the googling step of “how do I accomplish ABC in XYZ language/library”. And I’m not talking about noob developers barely scraping by with the AI – these guys are 6-figure salary developers who can already do everything under the sun.
There will be some job loss but I think more from the “website template churn out” sector. Not “all” jobs that’s for sure. The AI doesn’t understand the context of more complex programs, even though it can generate a useful snippet. You need a human to translate business need or manager request into a contextualized comment.
And you want (but maybe not NEED) a human who can hold the overall logic flow of the program in their head. A penny-pinching company could probably get by with an intern and AI but the code will eventually become spaghetti and they’ll keep digging themselves deeper.
Careful with the hype. Every technology that has been developed has been used for the most violent, evil, and depraved uses that can be contrived, as near as I can tell, just because. Then it is monetised. Good times.
The evil done with condoms.
Drug smuggling comes to mind.
This article contains a lot of interesting statements.
Firstly “It is not entirely unreasonable to fear that the last invention we need to make might just be the last invention that we get to make.”
Here one has to ask, Do we “NEED” to make this? Just because one can doesn’t mean one should.
“Generally, studying copyrighted code and then rewriting your own code is not an infringement of the original copyright.”
Doesn’t fall in line with “cleanroom” coding practices.
If a given individual studies copyright-protected software, it is often copyright infringing of them to make a piece of software that does the same thing. However, it isn’t usually copyright infringement for them to describe said software and let someone else program it for them.
Given this fairly well explored practice, it wouldn’t be unreasonable to argue that these machine learning systems are copyright infringing. (A notable example of this practice is the first IBM PC BIOS clone, which IBM tried, unsuccessfully, to sue over, since the person making the clone never saw the machine code they were mimicking.)
Though. The biggest issue with AI/ML is how fast it is.
A human has to wisely consider what sources to even invest their time on, since time is a very limited resource for a human. AI/ML systems largely don’t.
And humans are likewise greatly limited in their expressive capabilities – a limitation AI/ML systems yet again don’t really have.
A brief look at ChatGPT and we can trivially see that OpenAI likely has the computing capacity to generate more text than its entire training corpus in a matter of days. And considering that said corpus is effectively all the written text publicly available on the internet, that says something.
To a degree, AI/ML systems aren’t just a step forward. In comparison to human content generation, AI/ML systems are the atom bomb compared to the bow and arrow.
Market forces push these systems forth with a complete disregard for the consequences, let alone any respect towards the content creators whose material they abuse without the slightest care – all to see who can achieve larger market share and, in turn, venture capital and potential future profit. This trend is frankly concerning.
Current AI knows many things and understands nothing. This may be because a model built using statistical methods is not capturing the layers of value that humans attribute to pieces of knowledge, and how we value the relationships between pieces of knowledge. There is therefore the risk that the entire AI field is “barking up the wrong tree” and will never be able to produce a genuine form of intelligence using current strategies. At best we may just end up with a digital “idiot savant”. One area that may benefit is the rote training of human brains: using direct feedback, a system can know what stimulus you need to reinforce a given response, e.g. language training, since how we learn language casually as a child does seem to use a statistical approach that relies on vast amounts of very repetitive input. However, this is very much separate from how we learn to think and to reason. I’ve personally interacted with humans who had well-developed language functions but very little in the way of actual intelligence or thinking ability, and obvious jokes about public figures aside, it was a real eye-opener that a damaged or malformed brain can produce a verbose and social chatterbox that leaves you with the spooky feeling that the lights are on but nobody’s home. This is also how I see current AI systems.
Can’t be much worse than the current path. Cheap storage and memory have really lowered the standards. Now with higher-end hardware getting so cheap, the same thing is happening in the embedded space. I’m currently looking at a $400 motor controller which attempts to replace an existing $70 analog motor controller.
The new one has a 32bit processor with MEGS of code that simply reads an analog position value, performs a crude PID loop and outputs to a DC motor. This wasn’t designed by an inexperienced engineer either. This was an experienced coder who just copied some open-source crap, tweaked it, shipped it.
How is “AI-Assisted Code” any different?
Damn A.I., stealing our jobs, stealing our women!
From the programming epigrams of Alan Perlis:
https://www.cs.yale.edu/homes/perlis-alan/quotes.html
63. When we write programs that “learn”, it turns out that we do and they don’t.
93. When someone says “I want a programming language in which I need only say what I wish done,” give him a lollipop.
Another epigram, not attributed to Perlis, is “if you have N stakeholders, you’ll end up with an N-pass compiler.”
As it turns out, knowing syntax, algorithms, libraries, frameworks, and the general literature of code isn’t the hard part of programming. Deciding what the program should do, and organizing those decisions into mutually-consistent descriptions at different levels of granularity, is the hard part.
The hard part of that is making tradeoffs between competing alternatives.. deciding what to emphasize and what to give up.
Committees of marketers, lawyers, finance people, and middle managers with private fiefdoms don’t do that. Nobody gives anything up, they just create ‘specification documents’ that bury all the conflicts in ambiguous language that ultimately forces the lowest-level programmers to make the decisions.. piecewise and in isolation from each other.
The result is Windows Vista.. pieces that work individually but can’t be made to work together. Years of big promises that are gradually abandoned, eventually yielding a lackluster product that isn’t compelling on its own.
Giving those committees AI that can write code will only speed up the process of burying the failures to agree, and of letting one group sabotage another in ways that can’t obviously be traced back to anyone.
Interpretation of code performance will be a human endeavour for the foreseeable future. I’m sure bug analysis or debugging will become an AI forte at some point, but that’s the uncanny valley.
Artificial Intelligence (A.I.) has the ability to automate many tasks and processes, which could potentially impact certain job sectors. However, it is important to note that A.I. is not a single entity, but rather a collection of technologies that can be used to augment human capabilities and increase efficiency.
While A.I. has the ability to learn and automate certain tasks, it is not capable of taking all the jobs. A.I. is currently being used to augment human work in many industries, such as healthcare, finance, and manufacturing. A.I. can help humans to perform their jobs more efficiently and effectively, by taking on repetitive or time-consuming tasks, leaving humans to focus on more complex and creative tasks.
Additionally, the development of A.I. requires significant expertise and collaboration between researchers, engineers, and other professionals. Therefore, A.I. is not capable of “stealing” code or taking all jobs, as it still requires human intervention and oversight to function effectively.
In summary, while A.I. has the potential to automate certain tasks and processes, it is not capable of taking all jobs. Instead, it can be used to augment human capabilities and increase efficiency, leading to the creation of new job roles and opportunities.
I did not realize the name said ChatGPT, but while reading I thought, hmm, this looks familiar haha
ChatGPT likely can’t answer anything in regards to the impact of AI.
Not because it is an “AI”, or that it is created by OpenAI (a company that exclusively works on proliferating such technologies.)
But rather because back when its training data were collected, practically no one on the internet talked much about the impact of these types of systems. Especially not with any meaningful information at hand.
At best, ChatGPT’s answers on the topic are just an agglomeration of the internet’s speculations about AI.
At worst, it is a tailored output. (However, I somewhat doubt this to be the case, but I wouldn’t be surprised…)
Since it isn’t remotely hard for OpenAI to have had the foresight to train away any clearly negative opinions about AI and machine learning, as most questions about AI and its impacts are rather obvious. (It is, after all, in OpenAI’s best interest that their own product doesn’t talk negatively about itself and the field at large.)
And it isn’t as if OpenAI hasn’t actively worked on ensuring that ChatGPT doesn’t answer everything. In an oversimplified way of putting it, they have tried giving it a basic moral compass (that users like to circumnavigate).
There are clear conflicts of interest, to say the least.
Go ahead, make my day and replace me. At this point, with no interesting projects, I’m up for digging dirt. No AI wants to do that.
Like every C++ introductory tutorial anywhere that opens with: using namespace std;
It’s a neat tool, I’m currently using ChatGPT to make me a PHP framework (just for fun). It throws out the ideas and I make suggestions or critiques and it modifies its ideas. It’s like some kind of weird pair programming.
Some of the suggestions it makes are completely ridiculous, but it has lots of other ridiculous facts to back up its mistakes.
No, it’s not Skynet, just unplug the device.
So a “programmer” writes software but isn’t a good programmer; the software by definition isn’t any smarter than the programmer. Now add the “none of us is as dumb as all of us” factor for AI and you have a clown writing code.