

Why?
I use Linux. This means every day I use software developed by Google, Microsoft, IBM, Oracle, the US military, and the NSA.
It doesn’t really matter who developed or contributed so much as who benefits.


I’m not sure that’s right. It’s not like they’re giving money to Brave. The library itself isn’t tainted, and using it doesn’t benefit Brave or the CEO.
Further, simply supporting a thing doesn’t make that thing a moral proxy for the supporter. That path leads to an infinite regress of bad moral choices with nothing being moral.


Whoa, I never said I wasn’t interested in the exchange, only that I wasn’t interested in the topic.
As someone who’s extremely insistent that it’s grossly improper to make any form of inferences beyond what is literally stated, I’m shocked you would make such a leap!
I think you’re persistently confusing me with someone else. I perfectly understand your point, and have never had any doubt about what you intended to say. I never even disagreed with you on the topic.
I clarified someone else’s point to you, and you started explaining to me how they made unreasonable assumptions, which is what I disagreed with.
Intellectual property laws apply to open and closed source software and developers equally. When you make a statement about legal culpability for an action by one group, it makes sense to assume that statement applies to the other, because in the eyes of the law and most people in context there’s no distinction between them.
No one is unclear that you were only referring to one group anymore. That’s abundantly clear.
My point is that you’re being overly defensive about someone else making a normal assumption about the logic behind your argument. And you’re directing that defensiveness at someone who never even made that assumption.


I’m really not interested in the topic. I’m talking because I explained what someone else meant and you started responding as though that was an opinion or argument I was making.
That’s not “applying the argument consistently”, it’s removing context, overgeneralizing the argument, and applying a strawman based on a twisted version of it.
It’s really not.
It’s not unreasonable for someone to think “developers who use copyrighted code from AI aren’t liable for infringement” applies to closed source devs as well as open, and to disagree because they don’t like one of those.
It’s perfectly valid for you to also disagree and say the statement shouldn’t apply both ways, but that doesn’t make the other statement somehow a non sequitur.


Alright. I didn’t see any gotchas or argument, and didn’t make the comment.
That being said, reading the context I assume you’re referring to, it hardly reads like anything more than talking about the implication of the idea you shared.
Disagreeing because applying the argument consistently results in an undesirable outcome isn’t objectionable.


I don’t really see it as a divergence from the topic, since it’s the other side of a developer not being responsible for the code the LLM produces, like you were saying.
In any case, it’s not like conversations can’t drift to adjacent topics.
Besides, closed-source code developers could’ve been stealing open-source code all along. They don’t really need AI to do that.
Yes, but that’s the point of laundering something. Before, if you put FOSS code in your commercial product, a human could be deposed in the lawsuit and make it public, and then there were consequences. Now you can openly do so and point at the LLM.
People don’t launder money so they can spend it, they launder money so they can spend it openly.
Regardless, it wasn’t even my comment, I just understood what they were saying and I’ve already replied way out of proportion to how invested I am in the topic.


I believe what they’re referring to is the training of models on open source code, which is then used to generate closed source code.
The break in connection you mention makes it not legally infringement, but now code derived from open source is closed source.
Because of the untested nature of the situation, it’s unclear how it would unfold, likely hinging on how the request was formed.
We have similar precedent with reverse engineering, but a non-sentient tool doing it makes it complicated.


Just for more clarity: they workshopped ideas on how to improve clarity and accessibility with some editors at an event. They did some small experiments, then developed a plan to trial some of them and presented the plan to a wider audience for feedback. After they got feedback, they decided not to go ahead.
It’s not quite the editors pushing back on Wikipedia. Or rather, it’s not the “rebellion” people want to make it out to be.
https://www.mediawiki.org/wiki/Reading/Web/Content_Discovery_Experiments/Simple_Article_Summaries
It rubs me the wrong way when the process going how it should go gets cast as controversial and dramatic. Asking the community if you should do something and listening to them is how it’s supposed to go. It’s not resistance, it’s all of them being on the same team and talking.


Eh, that’s not quite original research. There are plenty of other examples of images and sound files created for Wikipedia. A representative example isn’t research, it’s just indicating what something is.
The Wikipedia articles on AI slop and generative AI have a few instances of content that’s there as a representative illustration of a sourced statement, as opposed to being evidence or something.
It’s similar to the various charts and animations.


Yes, but…
https://en.wikipedia.org/wiki/Wikipedia%3ADatabase_download
That’s because viewing the page uses server resources, as does API access. If you want the data you can download the database directly.


There’s hardware required to shunt the display out the USB port, and since it’s not a super in-demand feature they usually don’t implement it. As such, the software for looking nice while doing it isn’t as developed.
But yes, it’s been in developer settings for years, and was usable if your hardware supported it.


Yes. And now it’s native in all Android! Samsung helped make it!
It’s good when things get better.


Yeah, the conventional ones still draw a good chunk of power, and they’re not clean but they’re not dirty. Same as how a grocery store isn’t good for the environment but you’re not looking at them first for places to clean.
They tend to be boring, and are usually not a public thing but just something owned by a company to house their computers. The only reason I know about the ones near me is I used to work at one and people would move jobs to or from other ones. (As an aside, a datacenter is a great place to nap if you like white noise).
For a sense of scale:

This is the site of an OpenAI data center. The yellow square is about 1 square mile and mostly encompasses the area they plan to/have filled.

That angle shows more build out.

This photo has two normal data centers in it. The yellow square is also about 1 square mile. I’ve highlighted the data centers in red. One is to the left of the square near the middle, and the other is down from the right side near the big piles of what looks like rocks. (Spoilers: it’s rocks. They make asphalt). The sprawling complex in the upper right is a refrigerated grocery store distribution complex. The middle on the other side of the block from the asphalt is a coal power plant.
Of the things in this picture, I’m most upset about the giant freeway interchange. Coal is shit, but it’s a modern plant so it’s not belching soot, just CO2, and the utility is phasing it out anyway. The grocery traffic is mostly dead except between the hours of midnight and 7am when they do restocks.
I can hear the freeway if I go outside.


I think the part you’re missing is that 1) it’s my community too 2) they’re not talking about AI data centers, or new data centers or anything like that, they’re petitioning to ban all data centers, and 3) we have multiple data centers in the city already that no one complained about until AI data centers became a thing people felt concerned about.
There’s a major difference between the 2 square mile hyperscale AI data center that requires a nuclear reactor and a full water treatment plant to cool, and the 2 acre data center that’s air cooled, has no more ground pollution than any other parking lot, and is essentially a warehouse.
The state government has two in the city, at least, for processing electronic tax records, applications and hosting service sites. We have a few national insurance companies that need to process all the things they process. A research university, and a web hosting company round out the list of ones I know about.
This is my entire point about why sometimes it’s really necessary to point out that what someone is referring to is only a small part of what the words they’re using describe. The language being imprecise doesn’t matter until someone proposes a law outlawing chemicals, shuttering all data centers, or banning AI.
LLMs are problematic. My fancy rice maker isn’t.


I take your point. :)
It’s worth mentioning in my opinion though, because if someone were to say “we should ban chemicals” it’d be worthwhile to point out what that actually means.
I don’t actually think the broadness of the category is intentionally abused, it’s just that it’s an incredibly common thing to remove anything from the AI category that’s explicable.
I feel slightly more Hanlon’s razor about it since there are people in my city talking about and petitioning on the popular notion of banning all data centers from the state, and how it would be awful if a data center came here. I know what they mean, but it’s not what they’re trying to get the law to do, and our city already has six data centers I know of off the top of my head. The language drift is fine, but when it starts to conflate with policy it’s another issue.


https://blog.mozilla.org/en/mozilla/leadership/mozillas-next-chapter-anthony-enzor-demeo-new-ceo/
The root of the current discussion.


A conservative guess would be around 60 people.
https://bugzilla.mozilla.org/describecomponents.cgi
You can click around and see the bug reports they’re working on. There are a few, to say the least.
https://www.firefox.com/en-US/releases/
This is a way to see what’s in each release. The ones on the left are major releases and tend to have bigger features, and the others tend to be bug fixes.
Web browsers start with core functionality that’s very complex. Then you tack on that they’re being used for things like banking, and managing the critical details of people’s lives. That means security galore, which is hard and constant. Then you have ad people, who are also something that’s hard to defend against.
Then there’s the constant flood of new features you have to implement to keep up with Google.
Chrome has 1,000 to 4,000 people working on it. Mozilla employs about 700 to work on Firefox, with maybe 1,000 additional open source developers.
My initial guess was very wrong.


It’s less a vague umbrella and more an academic category. It just feels odd to call it vague in the same way you wouldn’t call “chemistry” vague, despite it having applications ranging from hand soap to toxic waste.


Yeah, OCR is a type of AI. The big advantage of modern techniques is that they can factor in context a bit better. It’s the same principle but a different mechanism for how you know a red octagon with S__P on it says stop, even if the sign is dented, a letter fully fell off, and it’s raining and dark.
It also means it’s sometimes wildly inaccurate, like in cases where it’s just so much more likely that the text said something else. Like how on a bright sunny day, with perfect clarity, and a crisp new sign with extra good visuals, you’ll hit the brakes for a sign that’s a red octagon that says §¥¢¶. It’s just very unlikely that that would coincidentally be on a red octagon near the road, so it’s more likely you saw wrong and it was actually the normal thing.
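To put toy numbers on the sign example, here’s a minimal sketch of the “context beats what you literally saw” idea using plain Bayes’ rule. Every number is made up for illustration, and this isn’t how any particular OCR model works internally; it only shows why a strong prior can override a clear observation.

```python
def posterior(priors: dict[str, float], likelihoods: dict[str, float]) -> dict[str, float]:
    """Bayes' rule over a small set of hypotheses about what the sign says."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}


# Hypothetical prior: red roadside signs of that shape almost always say STOP.
priors = {"STOP": 0.999, "gibberish": 0.001}

# Hypothetical likelihoods: how probable it is to *see* odd glyphs
# given what the sign really says.
likelihoods = {"STOP": 0.01, "gibberish": 0.95}

result = posterior(priors, likelihoods)
best_guess = max(result, key=result.get)
print(best_guess, result)
```

With these made-up numbers STOP still wins, even though the observation on its own points strongly at gibberish. That’s both the strength (dented sign, missing letter) and the failure mode (confidently reading the normal thing when it really did say something weird).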


It’s that, plus other factors. The regulations are more lenient, it’s easier to get a more efficient engine in with more mass to work with, it’s easier to pass safety ranking checks, and it’s easier to put comfort features in that consumers want.
Putting a large crumple zone on a compact isn’t as easy as putting one on a giant truck.
(Note this isn’t saying big cars are more efficient, or proportionally more efficient, but that the efficiency advances they’ve made over the years are easier to implement in a large engine.)