Reply to Create ML fails to train a text classifier using the BERT transfer learning algorithm
> We tried your macOS/Xcode but still can't reproduce your error, unfortunately.

Okay, well thanks for trying.

> You'd just need to set language: nil to get the behavior of latin/automatic.

But then how would I select the CJK/Automatic or Cyrillic/Automatic options? As far as I can tell, the CreateML framework doesn't distinguish between a language and a language family, so it's unclear how the Create ML app is able to make this distinction, or what the distinction actually means. From the WWDC video I linked to previously, it sounds like there is one model per language family, which would mean there's essentially no difference between choosing Latin/Automatic and Latin/English, or CJK/Automatic and CJK/Japanese. Can you clarify what the difference is? Possible theories include:

1. The language selector menu is just there to reassure users that they made the right choice of family, but it makes no difference under the hood.
2. Picking a specific language changes how the data is tokenized before training.
3. There are in fact separate BERT models trained on monolingual data, as well as models trained on multilingual data.
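For reference, here's roughly what the framework-side call looks like to me (a sketch only – the exact spelling of the `ModelParameters` labels and the transfer-learning algorithm case may differ between SDK versions, so please correct me if I've misremembered them):

```swift
import CreateML
import NaturalLanguage

// Load the labeled data (same "text"/"label" JSON format as the app accepts).
let data = try MLDataTable(contentsOf: URL(fileURLWithPath: "training.json"))

// `language: nil` is reportedly the equivalent of Latin/Automatic in the app;
// the question above is how to express the other family/Automatic options.
let parameters = MLTextClassifier.ModelParameters(
    algorithm: .transferLearning(.bertEmbedding, revision: 1),
    language: nil
)

let classifier = try MLTextClassifier(
    trainingData: data,
    textColumn: "text",
    labelColumn: "label",
    parameters: parameters
)
```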
1w
Reply to Create ML fails to train a text classifier using the BERT transfer learning algorithm
Hi @Frameworks Engineer, thanks for looking into this. Yes, here are the relevant version numbers:

- Tahoe Version 26.1 (25B78)
- Xcode Version 26.2 (17C52)
- Create ML Version 6.2 (175.7)
- MacBook Pro (M1 Pro, 32 GB)

I did wonder whether maybe I don't have the model assets on my machine, but note that I can successfully run the BERT option using the framework directly (with the above code). Although it's a little less convenient, I could simply build the models using the framework, but to do so I was hoping to get confirmation on two points:

1. Is there a specific way to select the Latin/Automatic option to get the multilingual model, which I can train with multi-language training data?
2. If not, should I simply set the language to e.g. .english to get the multilingual model? If I do so, does that mean that all languages will be tokenized with the "wrong" tokenizer, or could there be some other problem in doing this?

Many thanks for any guidance you can provide.
1w
Reply to Create ML fails to train a text classifier using the BERT transfer learning algorithm
@tjia Thanks for looking into this. Here's some small example data that you can use to test it – I don't think the data really matters, as it seems to fail whatever I put in. Save this as a JSON file and create a new Text Classifier project with default settings and the BERT algorithm. For me, the training stops immediately after the feature extraction phase with no error other than "Training stopped".

```json
[
  { "text": "Pinus contorta", "label": "animalsAndPlants" },
  { "text": "Rabbit", "label": "animalsAndPlants" },
  { "text": "Brochoadmones", "label": "animalsAndPlants" },
  { "text": "Zebra", "label": "animalsAndPlants" },
  { "text": "Oak", "label": "animalsAndPlants" },
  { "text": "Campanula rotundifolia", "label": "animalsAndPlants" },
  { "text": "Black wood pigeon", "label": "animalsAndPlants" },
  { "text": "Colorado potato beetle", "label": "animalsAndPlants" },
  { "text": "Corvidae", "label": "animalsAndPlants" },
  { "text": "Honey bee", "label": "animalsAndPlants" },
  { "text": "Pablo Picasso", "label": "artAndDesign" },
  { "text": "Paul Cézanne", "label": "artAndDesign" },
  { "text": "Marcel Duchamp", "label": "artAndDesign" },
  { "text": "Proto-Cubism", "label": "artAndDesign" },
  { "text": "Vincent van Gogh", "label": "artAndDesign" },
  { "text": "Cubism", "label": "artAndDesign" },
  { "text": "Rococo", "label": "artAndDesign" },
  { "text": "Art", "label": "artAndDesign" },
  { "text": "Interior design", "label": "artAndDesign" },
  { "text": "Typography", "label": "artAndDesign" }
]
```
1w
Reply to Foundation Models: Is the .anyOf guide guaranteed to produce a valid string?
@Frameworks Engineer:

> Could you provide the version of your operating system where this issue was seen?

I'm using the iOS simulator (iOS 26.2) running on Tahoe 26.1. I could try updating to Tahoe 26.2, if you think that would help? I have not tested this on device.

@Apple Designer:

> The best work-around I can offer for now is just add a verification in your tool call itself.

Thanks, yes, I experimented with this, but I found that the model sometimes just persistently keeps calling the tool with the same invalid argument, sometimes appearing to get stuck in an infinite loop. I've also tried listing the valid section names in the instructions, but even with that I've observed that the model will still try to call the tool with invalid arguments. (This new world of non-deterministic engineering sure is an adventure!)

More general question (assuming there was no bug)... would you recommend listing the valid arguments in the instructions? Or would that be redundant because the valid arguments are listed in the schema definition (which I presume is fed to the model behind the scenes)? In other words, how does anyOf actually work:

1. anyOf lists all the options in the schema that's presented to the model.
2. anyOf constrains the generation of the string at prediction time.
3. anyOf does both of the above.

I'm asking because I presume it's important that the model knows all available options in advance of the prediction of an input argument. So, if anyOf does not do (1), then I guess it would be important to list the valid options in the instructions.
2w
Reply to Foundation Models: Is the .anyOf guide guaranteed to produce a valid string?
Thanks – I didn't even think to test that because I assumed it must be something to do with creating a dynamic schema at runtime. Do you have any ideas about a workaround? Is there some other way of defining a schema at runtime with guided enum-like behavior? Or do you have any ideas about a timeframe for the bugfix (I mean, is this an iOS 26.3 type thing, or an iOS 27 type thing)?
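In case it's useful to anyone else hitting this: one alternative I've been meaning to try is DynamicGenerationSchema, which has an initializer that takes the choices directly and builds an enum-like string schema at runtime. This is a sketch based on my reading of the docs – I haven't yet verified whether it sidesteps the same bug:

```swift
import FoundationModels

// Build an enum-like schema at runtime from a list of strings.
// DynamicGenerationSchema(name:anyOf:) is intended to constrain the
// generated value to one of the given choices.
let citySchema = DynamicGenerationSchema(
    name: "city",
    anyOf: ["London", "New York", "Paris"]
)
let schema = try GenerationSchema(root: citySchema, dependencies: [])

let session = LanguageModelSession()
// Ask for structured output against the runtime-built schema.
let response = try await session.respond(
    to: "Pick a big city in Europe",
    schema: schema
)
print(response.content)
```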
2w
Reply to Foundation Models: Is the .anyOf guide guaranteed to produce a valid string?
I'm dealing with encyclopedic articles, and quite often the model will request the "Introduction" section, even though no such string appears in the sections array. Sometimes it also hallucinates sections that it thinks ought to exist based on the prompt.

While I was debugging this, I constructed a simpler toy example. It might be easier if we talk about that instead. Here's a playground I made to demonstrate the issue:

```swift
import Playgrounds
import FoundationModels

#Playground {
    struct CityInfo: Tool {
        let validCities: [String]

        let name: String = "getCityInfo"
        let description: String = "Get information about a city."

        var parameters: GenerationSchema {
            GenerationSchema(
                type: GeneratedContent.self,
                properties: [
                    GenerationSchema.Property(
                        name: "city",
                        description: "The city to get information about.",
                        type: String.self,
                        guides: [.anyOf(validCities)]
                    )
                ]
            )
        }

        func call(arguments: GeneratedContent) throws -> String {
            print(arguments.generatedContent)
            let cityName = try arguments.value(String.self, forProperty: "city")
            let cityInfo = getCityInfo(for: cityName)
            return cityInfo
        }

        func getCityInfo(for city: String) -> String {
            switch city {
            case "London": return "Some info about London..."
            case "New York": return "Some info about New York..."
            case "Paris": return "Some info about Paris..."
            default: return "Unrecognized city!"
            }
        }
    }

    let citiesDefinedAtRuntime = ["London", "New York", "Paris"]
    let tools = [CityInfo(validCities: citiesDefinedAtRuntime)]

    let instructions = """
        You are a travel guide. Your job is to pick a city for the user to \
        travel to based on their requirements. Once you've picked a city you \
        should provide some information to the user about the city and why \
        it's a good choice. To help you, you can use the getCityInfo tool to \
        get information about a city.
        """

    let session = LanguageModelSession(tools: tools, instructions: instructions)
    let response = try await session.respond(to: "I want to travel to a big city in China")
}
```

When I run this, it usually tries to request info about Beijing (the generated content is `{"city":"Beijing"}`). Or, if I change the prompt to "I want to travel to a big city in Japan", it will try to request info about Tokyo, etc. You might need to run it a few times to reproduce the issue. My understanding is that this should not be physically possible with guided generation. So, I'm wondering if I've set up the GenerationSchema correctly?
2w
Reply to Foundation Models: Is the .anyOf guide guaranteed to produce a valid string?
Thanks, @Apple Designer. I greatly appreciate any help you can give me. I think this is not the issue, because I initialize the Tool at the same time I initialize the session, and the article/sections are not supposed to change for the lifetime of the session. So, I do this:

```swift
let tools = [SectionReader(article: article, sections: articleSections)]
let session = LanguageModelSession(tools: tools, instructions: prompt)
```

So, I believe it should be fine for the `parameters` property to be computed at initialization time (the sections are known at initialization and do not change).
2w
Reply to Defining a Foundation Models Tool with arguments determined at runtime
I found a simpler way to set up the tool with an argument that's defined at runtime:

```swift
struct CityInfo: Tool {
    let validCities: [String]

    let name: String = "getCityInfo"
    let description: String = "Get information about a city."

    var parameters: GenerationSchema {
        GenerationSchema(
            type: GeneratedContent.self,
            properties: [
                GenerationSchema.Property(
                    name: "city",
                    description: "The city to get information about.",
                    type: String.self,
                    guides: [.anyOf(validCities)]
                )
            ]
        )
    }
}
```

However, the LLM will still try to generate cities that are not valid. For example, the model will happily generate content like `{"city":"Tokyo"}` even if "Tokyo" is not in the validCities array. Have I misunderstood what .anyOf() is supposed to do? The documentation says "Enforces that the string be one of the provided values.", but in my testing this is not true. Is this a bug, or is .anyOf() just a strong recommendation rather than a guarantee?
Jan ’26
Reply to Guidance on implementing Declared Age Range API in response to Texas SB2420
> It will hurt some Texas users, but not the rest of the world, which is very important.

I agree – one of my main concerns is how these laws will impact all my other users.

> Should we terminate the app?

You could do that, but it's not a very good user experience. I plan to present some simple messaging that directs users to an Apple support article.

Here's a quick sketch of how I'm currently planning to handle this across a few different apps (in SwiftUI). I would appreciate any feedback on this approach, from either a technical or legal standpoint.

In my main App struct, I will branch into a new ContentViewWithAgeGate view for iOS 26.2+:

```swift
WindowGroup {
    if #available(iOS 26.2, *) {
        ContentViewWithAgeGate()
    } else {
        ContentView()
    }
}
```

ContentViewWithAgeGate acts as a wrapper around ContentView and performs the checks:

```swift
import SwiftUI
@preconcurrency import DeclaredAgeRange

@available(iOS 26.2, *)
struct ContentViewWithAgeGate: View {
    @Environment(\.requestAgeRange) var requestAgeRange
    @State private var isAgeRestricted: Bool = false

    var body: some View {
        if isAgeRestricted {
            AgeRestrictedView()
        } else {
            ContentView()
                .task {
                    isAgeRestricted = await determineIfUserIsAgeRestricted()
                }
        }
    }

    private func determineIfUserIsAgeRestricted() async -> Bool {
        let isEligibleForAgeFeatures = try? await AgeRangeService.shared.isEligibleForAgeFeatures
        guard let isEligibleForAgeFeatures, isEligibleForAgeFeatures == true else {
            return false
        }
        guard let ageRangeResponse = try? await requestAgeRange(ageGates: 18) else {
            return true
        }
        switch ageRangeResponse {
        case .sharing(let range):
            guard let lowerBound = range.lowerBound,
                  let ageRangeDeclaration = range.ageRangeDeclaration else {
                // No lower bound or no declaration information; prevent access
                return true
            }
            if lowerBound >= 18 {
                // User is an adult
                switch ageRangeDeclaration {
                case .selfDeclared, .guardianDeclared:
                    // Insufficient level of evidence; prevent access
                    return true
                case .checkedByOtherMethod, .guardianCheckedByOtherMethod,
                     .governmentIDChecked, .guardianGovernmentIDChecked,
                     .paymentChecked, .guardianPaymentChecked:
                    // Sufficient level of evidence; permit access
                    return false
                @unknown default:
                    // Unknown AgeRangeDeclaration value; prevent access
                    return true
                }
            } else {
                // User is not old enough; prevent access
                return true
            }
        case .declinedSharing:
            // User declined to share age info; prevent access
            return true
        @unknown default:
            // Unknown response value; prevent access
            return true
        }
    }
}
```

Then, AgeRestrictedView just presents some general information:

```swift
struct AgeRestrictedView: View {
    var body: some View {
        VStack(alignment: .center, spacing: 30) {
            Text("Access to this app is age-restricted due to local laws in your state or territory.")
            Text("Please verify your age with Apple and allow this app to access your age information.")
            Text("For further information, please refer to the following Apple support article: https://support.apple.com/en-us/122770")
        }
        .multilineTextAlignment(.center)
        .padding()
    }
}
```
Dec ’25
Reply to Guidance on implementing Declared Age Range API in response to Texas SB2420
> (0) check if iOS is 26+. Otherwise, proceed without any test (because we cannot do them)

Yep, agree. In fact, we specifically need to check for iOS 26.2.

> in (1), which import to use AppStore.ageRatingCode?

You just need to `import StoreKit` to access that. However, as noted below, I'm considering a different option.

> in (2), if UIKit and not SwiftUI, need the in parameter

Indeed you do. Annoyingly, you also need the `in` parameter if you're putting your code in some class (e.g. some kind of age manager class), and it's not clear to me what you need to pass in (assuming you have a SwiftUI app).

> where should parental control be tested? In step (5)?

I honestly don't know at this point – I haven't gotten that far yet, and as noted below, I'm looking for ways to avoid that side of things.

> where to deal with change in user's age or repudiation (as required by law if I read properly)

My assumption is that if you check the age on every launch, then this should take care of itself. But I could be wrong.

> what happens if the requests in await do not respond? Is there some type of timeout, to avoid user being locked in waiting?

I'm not sure – I'm assuming they should return quite quickly and may not even require network connectivity if the information is cached on device (the WWDC video talks about device caching of the age info). In my testing, I've found that the API calls return very quickly (< 1 second), but that may simply be because it's a test environment. In any case, my intention is to default to giving the user full access to the app, and I'll only override that if the methods (called in an async task) suggest the user is not allowed access. However, I do not know if this is appropriate, and I am open to other suggestions.

Relatedly, what should we do if the methods throw an error? Should we assume the user is a child and restrict access? My current plan is to treat the two API calls differently:

- If isEligibleForAgeFeatures throws, I will assume that the user is not eligible and therefore has full access.
- If requestAgeRange throws, I will assume the user is a child and restrict access.

My logic is that if I cannot determine eligibility, then I should err on the side of the user not being eligible, since the vast majority of users (for the foreseeable future and around the world) will not be eligible. However, if the user is eligible for age features, then we should err on the side of caution and assume child until proven otherwise.

Having thought about all this for another day, my new plan is to drop the App Store age rating check (for now) and use age 18 for the age gate parameter. My primary reason is that I don't have much time to properly investigate the PermissionKit stuff and the "significant changes" stuff before the end of the year (and that all seems rather more complicated). So, my first priority is to make sure that the app is blocked for all under-18s (who are subject to the law) until I have a clearer understanding of those issues. However, I would be very keen to hear how other people are handling that side of things.
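To make that throw-handling explicit, here's how I'm sketching the check with do/catch instead of try? (assuming it lives in a view with requestAgeRange from the environment; I've elided the evidence-level checks for brevity):

```swift
private func determineIfUserIsAgeRestricted() async -> Bool {
    let eligible: Bool
    do {
        eligible = try await AgeRangeService.shared.isEligibleForAgeFeatures
    } catch {
        // Can't determine eligibility: assume not eligible → full access.
        return false
    }
    guard eligible else { return false }

    do {
        let ageRangeResponse = try await requestAgeRange(ageGates: 18)
        switch ageRangeResponse {
        case .sharing(let range):
            // (ageRangeDeclaration evidence checks elided in this sketch)
            return (range.lowerBound ?? 0) < 18
        case .declinedSharing:
            return true
        @unknown default:
            return true
        }
    } catch {
        // Eligible, but the age request failed: assume child → restrict.
        return true
    }
}
```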
Dec ’25
Reply to Correctly initializing observable classes in modern SwiftUI
Thanks for getting back to me. I'm thinking specifically about the instantiation of a manager-style class that is global to the app and placed into the environment at launch time. The documentation that you linked to gives this specific example:

```swift
@main
struct BookReaderApp: App {
    @State private var library = Library()

    var body: some Scene {
        WindowGroup {
            LibraryView()
                .environment(library)
        }
    }
}
```

In the example, the Library class is created as a @State var and put into the environment.

My first question (which is more theoretical) is: Why does the variable need to be annotated with @State? My (perhaps naive) assumption was that the App struct is special and is initialized only once, so it's not immediately obvious to me why its properties need to be held in state. Indeed, making it a let constant without the @State annotation seems to result in the same behavior.

My second, more practical question is: What is the correct way to manually instantiate the Library (rather than doing so in the property declarations)? There are various reasons why you might want to do this, but for the sake of example, I'm thinking of something along these lines:

```swift
@main
struct BookReaderApp: App {
    @State private var library: Library

    init() {
        let thing = Thing()
        self.library = Library(thing: thing)
    }

    var body: some Scene {
        WindowGroup {
            LibraryView()
                .environment(library)
        }
    }
}
```

Note that in this case, Library is not created in the property declarations, but in the init. Searching around on GitHub, I find some examples where people do:

```swift
self.library = Library(thing: thing)
```

and other cases where people do:

```swift
self._library = State(initialValue: Library(thing: thing))
```

I'm trying to understand what the difference is between these two (if anything). Thanks for any light you can shed on these (possibly newbie) questions!
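For what it's worth, outside of SwiftUI the two spellings are easier to compare with a plain property wrapper (Logged below is a made-up wrapper, so this only illustrates the general mechanism, not anything @State-specific): assigning to the wrapped property as the *first* assignment in an init is rewritten by the compiler into a call to the wrapper's init(wrappedValue:), while `self._value = ...` constructs the backing storage explicitly, so for the initial assignment the two appear to end up equivalent.

```swift
// A toy property wrapper that counts writes made through the
// wrappedValue setter, to compare the two initialization spellings.
@propertyWrapper
struct Logged<Value> {
    private(set) var setterCalls = 0
    private var storage: Value

    var wrappedValue: Value {
        get { storage }
        set { storage = newValue; setterCalls += 1 }
    }

    init(wrappedValue: Value) { storage = wrappedValue }
}

struct DirectAssign {
    @Logged var value: Int
    init(_ v: Int) {
        // First assignment in an init is compiled as
        // `self._value = Logged(wrappedValue: v)` — not a setter call.
        self.value = v
    }
    var setterCalls: Int { _value.setterCalls }
}

struct StorageAssign {
    @Logged var value: Int
    init(_ v: Int) {
        // Explicitly constructs the wrapper's backing storage.
        self._value = Logged(wrappedValue: v)
    }
    var setterCalls: Int { _value.setterCalls }
}
```

In both cases the setter is never invoked during init, which would explain why both GitHub patterns behave the same in practice.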
Topic: UI Frameworks · SubTopic: SwiftUI
Nov ’25
Reply to Some issues and questions regarding the use of the BGContinuedProcessingTask API
> The engineering team agrees that differentiating between "system expiration" and "user cancellation" is a significant oversight in the current API. I can't comment on future plans/scheduling, but this is something I expect the API to address.

Thanks, this is good to know!

> if you're seeing UI when there are NOT tasks from other apps active, then that's worth filing a bug on.

This is definitely happening to me. The BGContinuedProcessingTask UI (the notification-banner thing) shows up immediately every time I start such a task, even if no other apps are running a background task (iPhone 12 mini, running iOS 26.1). I didn't realize that this was not the intended behavior. It's kind of hard to file a bug, because I don't think there's any specific documentation stating that this is not how the API is supposed to work.
Nov ’25
Reply to BGContinuedProcessingTask expiring unpredictably
Thanks for the advice!

> Informally, I've generally found that a progress bar needs to update every ~0.1s (10x/sec) to ~1-2s in order to look "right".

I guess this depends on the magnitude of the task. If the task takes an hour, then updating every 1s would mean that each update moves the progress bar by a fraction of a pixel. My current setup moves the progress bar about 4px at a time, which looks pretty smooth to me.

> You can use the progress delegate that provides "raw" rate data; however, the easier option is to use the URLSessionTask.Progress object. With single downloads, you can use that Progress object directly, while more complicated scenarios can use child progress and the flow described here.

Thanks for the pointers. I looked into these options but, in the end, it feels like tying the progress to bytes downloaded (rather than the number of files completed) is going to open up other issues, notably:

- If a file fails to download for some reason (and has to be restarted), I will need to subtract the downloaded bytes from the progress or, in some sense, track each file's progress. Not impossible, but it feels fraught with state-management bug potential.
- My task involves downloading and processing files. In some cases, the processing is very small relative to the download time, but in other cases, the processing can take longer than the download, so linking progress only to the download part is not ideal. That's why I originally chose to link progress to file completion.
- The modifications I'd need to make to my code (passing download progress from the URLSession delegate through an AsyncStream back to the calling task) would add complexity that will be hard to test, and background URLSessions are already really hard to work with.

Overall, my sense is that BGContinuedProcessingTask is better suited to tasks that are primarily CPU-bound, where the rate of progress and time-to-completion are fairly predictable a priori. For tasks that are more network-bound, where progress is heavily dependent on the volume of user data, it becomes hard to guarantee a particular rate of progress. Since BGContinuedProcessingTask seems to be quite trigger-happy and doesn't distinguish between user cancellation and system cancellation, I don't think it's going to be the right option for me. Which is a shame, because I was quite excited about it when I saw it at WWDC.

Anyway, thanks for the advice, Kevin, and for being very responsive (also to other threads on this forum). It's much appreciated!
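For context, the file-completion approach I described looks roughly like this (a sketch only: downloadAndProcess is a stand-in for my own per-file work, and I'm assuming the task's progress object is what the system UI observes):

```swift
import BackgroundTasks

// Sketch: drive the continued-processing progress by files completed,
// rather than bytes downloaded. One unit per file keeps progress coarse
// but predictable, regardless of individual file sizes.
func run(_ task: BGContinuedProcessingTask, files: [URL]) async {
    task.progress.totalUnitCount = Int64(files.count)
    task.progress.completedUnitCount = 0

    for file in files {
        // Hypothetical per-file work: download, then process.
        await downloadAndProcess(file)
        task.progress.completedUnitCount += 1
    }
    task.setTaskCompleted(success: true)
}
```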
Nov ’25