The bubble is already bursting, and there are lawsuits with some pretty big players.. I think one of the main writers' lawsuits has people like Stephen King, whose work was copied without permission for 'training'. The artist side includes Disney, which is one of the things I'm agreeing with the company on..
If anyone here remembers back (and knows about it), I think it was just after the days of newsgroups and in the days of the early forums.. I can't remember who the companies were, but I think it was to do with Netscape.. anyway, two forums. They both had illegal content. One kinda got away with it (though it had to deal with stuff) because they DIDN'T have any mod controls and were open about that, therefore anything went. The other HAD mods. The one which had mods had to pay a lot of money because, as they had mods, they should have stopped the illegal stuff.
Part of the suit which Disney is one of the people in is to do with stealing stuff. A fake AI 'creates' a drawing by stitching things together, using predictive data with a model sheet of what it should look like. People are knocking off copyrighted characters, pretty damn close (with a lot of flaws) but close enough that it's copyright infringement. Disney says the AI companies should be able to put in something so that when you type in "Draw Disney's Mickey Mouse", it won't. The companies say they can't control it. Fine.. So the other side said: so you can illegally make child porn with it? "Oh no, we have filters to stop that".. so basically, they only mod the stuff which they want controlled. And as there is already precedent (it has already been tested and set before), it's pretty open and shut when they finally get through the system.
But the systems work pretty much like predictive text on a phone. You set up a basic 'brain' structure and either feed it data manually, or let it 'try it'.. The 'try it' approach doesn't work for typing or 'art'; for those, it basically has to copy and paste. A very simple way to show it, if you know a bit about cryptography, is with letter frequencies.
If you type 'th', what is the most common third letter? e. It's gonna be the word 'the'.. not always, but most likely. A completely sealed system only learns that from you typing it like that. Looking at what I've typed in this post alone, pretty much every time I have typed 'th' at the start of a word, it's followed by an e. So it would see that the chance of the third letter being an e is very high, and suggest that. If you agree, it gets reinforced, so it'll automatically go for that option more. If the most common was the word 'that', it would instead suggest that. Most phones seem to give the most common and the next two most common.. for th.... mm.. 'the' would be most common, then you would have words like than, that, this, these, those etc.
It would record what you typed and store the most common at any given point in a database. A lot of cross references. After a while, it can also build up word groupings. The most common word after 'good' is probably 'bye', so it will suggest 'good bye' when you start to type 'goo'. If you have a modern system, you can handle the processing of millions and millions of checks like that at one time, and it'll learn more than just 2-word groups; it can learn more and more, to a stage where you can give a very basic line and it'll pump out the most common way of doing that. Once you have the core system and the interconnections, it can just add in its own connections, like a normal brain (thus neural network), to a degree that you can say 'write a story about a rat' and it'll be able to put out the most common 1000 words or more, which happens to be the most common story about a rat.
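For anyone who wants to poke at the counting idea, here is a rough sketch in Python. It is a toy, not how any real phone keyboard or large model is built: the `Predictor` class and its method names are made up for illustration, and it only tracks whole-word counts plus which word followed which.

```python
from collections import Counter, defaultdict


class Predictor:
    """Toy predictive text: count what was typed, suggest the most common."""

    def __init__(self):
        self.words = Counter()                 # how often each word was typed
        self.following = defaultdict(Counter)  # word -> counts of the next word

    def learn(self, text):
        """Record every word and every pair of neighbouring words."""
        tokens = text.lower().split()
        self.words.update(tokens)
        for prev, nxt in zip(tokens, tokens[1:]):
            self.following[prev][nxt] += 1

    def complete(self, prefix, n=3):
        """Top-n known words starting with `prefix`, most common first."""
        matches = Counter(
            {w: c for w, c in self.words.items() if w.startswith(prefix)}
        )
        return [w for w, _ in matches.most_common(n)]

    def next_word(self, word, n=3):
        """Top-n words that have followed `word` so far."""
        return [w for w, _ in self.following.get(word, Counter()).most_common(n)]

    def accept(self, word):
        """The user agreed with a suggestion, so reinforce it."""
        self.words[word] += 1


p = Predictor()
p.learn("the cat sat on the mat and that was the end good bye good bye good dog")
print(p.complete("th"))     # ['the', 'that']
print(p.next_word("good"))  # ['bye', 'dog']
```

Everything it 'knows' is a tally of what it was fed, which is the point being made above: change the text passed to `learn` and the suggestions change with it.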
Now, every time someone enters data, it learns more, so the 'most common' changes. So you get a bit of variation. Also, to give the false impression of 'creativity', you get it to randomly add something every so often and pretend that was entered, thus you will get something different. However, often these random bits are what are called 'hiccups', because they just don't work at all as they're completely wrong. When you write a decent one, you put in reinforcers to give more weight to the 'correct' path. Just like normal learning. You reward what's good, and punish what's bad.
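A minimal sketch of that 'mostly the common one, sometimes a random one' behaviour plus reward and punishment, again in Python. The names `pick`, `feedback` and `wobble` are invented here, and this is closer to what is sometimes called an epsilon-greedy choice than to how production systems actually sample their output.

```python
import random


def pick(options, rng, wobble=0.1):
    """Usually return the highest-weighted option.

    With probability `wobble`, return any option at random instead,
    which is where the variation (and the hiccups) come from.
    """
    if rng.random() < wobble:
        return rng.choice(sorted(options))
    return max(options, key=options.get)


def feedback(options, choice, good, step=1.0):
    """Reward what's good, punish what's bad (weights never drop below 0)."""
    change = step if good else -step
    options[choice] = max(0.0, options[choice] + change)


weights = {"the": 5.0, "that": 2.0}
rng = random.Random(0)          # seeded so a run can be repeated
first = pick(weights, rng)      # almost always 'the' at these weights
feedback(weights, first, good=True)
```

Setting `wobble=0.0` makes it completely predictable, and punishing an option often enough lets the next most common one take over.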
Back in the fun days of A-life simulators (I really need to get back to the one I was toying away with), you pretty much wanted to create something that could pass as 'realistic'. In a basic way, you have a brain set up with 'needs'. For a basic life form you want to emulate, you have something like 'Hunger', 'Tiredness', 'Cold', to give a couple of basics. The 'creature' then has a pleasure and a pain variable. The more the hunger increases, the more the pain goes up. If the hunger goes down, the pain goes down and the pleasure goes up. Therefore, it learns that when it is hungry, it needs to decrease this value. You can then tell it what food is. Give it an object you set as 'food'. IF it interacts with it in a way we shall call 'eating', and that decreases the hunger, it'll start to learn what to do. It learns 'eat' 'food' decreases 'hunger', increases pleasure. To give a lifelike brain system, you have it 'forget' things over time and randomly, so it'll need to eat food often enough to keep remembering what it does. You can also fake a life cycle where, as it gets older, it forgets quicker.
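Here is a very stripped-down sketch of that kind of creature in Python. All of it is hypothetical: the `Creature` class, the `WORLD` rule table and the numbers are made up to show the hunger, pain, pleasure and forgetting loop, and the age-based forgetting is left out to keep it short.

```python
import random

# Hypothetical world rules: how much hunger each (action, object) removes.
WORLD = {("eat", "food"): 3.0}
CHOICES = [("eat", "food"), ("eat", "rock"), ("poke", "food")]


class Creature:
    def __init__(self, rng, forget=0.05):
        self.rng = rng
        self.hunger = 0.0
        self.pain = 0.0
        self.pleasure = 0.0
        self.forget = forget   # fraction of each memory lost per tick
        self.memory = {}       # (action, object) -> learned value

    def tick(self):
        """Time passes: hunger (and so pain) rises, memories fade a bit."""
        self.hunger += 1.0
        self.pain = self.hunger
        for key in self.memory:
            self.memory[key] *= 1.0 - self.forget

    def act(self, action, obj, world):
        """Try something; remember it in proportion to the relief it gave."""
        before = self.hunger
        self.hunger = max(0.0, before - world.get((action, obj), 0.0))
        relief = before - self.hunger
        self.pain = self.hunger
        self.pleasure = relief
        key = (action, obj)
        self.memory[key] = self.memory.get(key, 0.0) + relief
        return relief

    def best_action(self, choices):
        """Use what has worked before, otherwise try something at random."""
        known = max(choices, key=lambda c: self.memory.get(c, 0.0))
        if self.memory.get(known, 0.0) > 0.0:
            return known
        return self.rng.choice(choices)


def simulate(creature, world, choices, steps=50):
    """Let the creature live for a while, acting once per tick."""
    for _ in range(steps):
        creature.tick()
        action, obj = creature.best_action(choices)
        creature.act(action, obj, world)
    return creature
```

Calling `simulate(Creature(random.Random(1)), WORLD, CHOICES)` lets it flail about at random until it stumbles on eating food, after which the remembered relief keeps it coming back.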
That is very basic and I'm not advanced enough to go into the full, full side, but that's basically it. It's pretty simple, but when the first tries at the system were being done in the 60s (well, the first major one), it was found that you just couldn't store enough data and process it well enough. The human brain is of such size that.. well.. you just couldn't store it. Think of how much space it takes up to store 1 letter. Sure, you think "That's only 1 byte". Think. Back in about 1996, I got an amazingly large secondary hard drive. That was 6GB. The main drive was 1GB. 6GB.. and 1GB is 1024 megabytes, etc.
6GB is enough to store 6,442,450,944 characters.. sounds a lot, but the average number of characters on an A4 page is 1,800, so 6GB can store 3,579,139 pages of text. Though I'm not 100% sure if that includes spaces (ignore compression for a bit, that gets complex). Say the average book has 300 pages. That works out to 11,930 books. Starts to look a bit small, eh? In 1948, the first form of computer memory used as RAM was created. The vacuum tubes could store 1024 bits of information. An audio cassette, like for a Spectrum game, could only store a few hundred kilobytes per side at normal loading speed. In order to have the data storage and the processing power to, well, process the amount of data needed for a complex system... well.. it's why serious neural network research didn't fully take off until the late 70s. At which time, its predictive text gave us 'millennium hand and shrimp'. Such random word groups (yep, I take it most people know who got that data out of a system and used it).
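The sums above are easy to check. A quick sketch in Python, assuming 1 byte per character, 1GB taken as 1024^3 bytes, and the 1,800 characters per page and 300 pages per book figures used above (the variable names are just for illustration):

```python
GIB = 1024 ** 3            # bytes in one "GB" as used above (binary gigabyte)

drive_bytes = 6 * GIB      # one character per byte, no compression
chars_per_page = 1800      # assumed average for an A4 page
pages_per_book = 300       # assumed average book length

pages = drive_bytes // chars_per_page
books = pages // pages_per_book

print(f"{drive_bytes:,} characters")  # 6,442,450,944 characters
print(f"{pages:,} pages")             # 3,579,139 pages
print(f"{books:,} books")             # 11,930 books
```

Change `chars_per_page` or `pages_per_book` if you prefer different averages; the order of magnitude stays about the same.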
Anyway.. I think that's more than enough of the basics.. For the more advanced stuff, you're gonna need to really check some decent research papers and stuff which.. sad to say, some go over my head, others I just find really dull.. I prefer more abstract research papers.. more like pseudo code, cause they are FAR more useful in everyday life. I wonder if they still teach people to pseudo code things first these days.. They should.