Natural Language


Analyze natural language text and deduce its language-specific metadata using Natural Language.

Posts under Natural Language tag

5 Posts

Post

Replies

Boosts

Views

Activity

Autocorrection and predictive text support for additional Cyrillic languages
Hello Apple Keyboard / Internationalization team,

I would like to ask about autocorrection and predictive text support for additional Cyrillic-based languages, especially Kazakh, Kyrgyz, Chuvash, and Ingush. These languages use Cyrillic scripts with their own letters, spelling rules, and word-frequency patterns. When users type in these languages, Russian-based autocorrection or missing language-specific correction can produce incorrect suggestions or replacements.

My questions are:
Are there plans to expand autocorrection and predictive text support for more Cyrillic-based languages?
Is there a recommended way for developers or language communities to provide dictionaries, word-frequency lists, corpora, or other linguistic data to help improve autocorrection?
Should this type of request be submitted through Feedback Assistant, Developer Forums, or another Apple channel?

I have corpus-based frequency data and language resources for multiple Cyrillic-based languages and would be happy to share them if useful.

Thank you.
Ali Kuzhuget
1
0
80
5d
Cyrillic keyboard long-press support for additional languages
Hello Apple Keyboard / Internationalization team,

In the current beta, I noticed new keyboard support for Tuvan and Sakha. Thank you. This is very important for Cyrillic-based languages and their communities. I also noticed improvements to the Russian keyboard long-press options, but some Cyrillic letters used by other languages still seem to be missing. For example, Ossetian uses Ӕ ӕ, and this character does not appear as a long-press option.

My questions are:
Are there plans to expand the Russian keyboard long-press mappings to cover more Cyrillic-based languages?
Is there a recommended way for language communities or developers to provide corpus/frequency data and character mappings to help improve keyboard support?
Should this type of request be submitted through Feedback Assistant, Developer Forums, or another channel?

I have corpus-based frequency data and long-press mapping data for many Cyrillic-based languages and would be happy to share it if useful.

Thank you.
Ali Kuzhuget
1
0
98
5d
Autocorrection and predictive text support for additional Cyrillic languages
Hello Apple Keyboard / Internationalization team,

I would like to ask about autocorrection and predictive text support for additional Cyrillic-based languages, especially Kazakh, Kyrgyz, Chuvash, and Ingush. These languages use Cyrillic scripts with their own letters, spelling rules, and word-frequency patterns. When users type in these languages, Russian-based autocorrection or missing language-specific correction can produce incorrect suggestions or replacements.

My questions are:
Are there plans to expand autocorrection and predictive text support for more Cyrillic-based languages?
Is there a recommended way for developers or language communities to provide dictionaries, word-frequency lists, corpora, or other linguistic data to help improve autocorrection?
Should this type of request be submitted through Feedback Assistant, Developer Forums, or another Apple channel?

I have corpus-based frequency data and language resources for multiple Cyrillic-based languages and would be happy to share them if useful.

Thank you.
Ali Kuzhuget
1
3
64
6d
Problem running NLContextualEmbeddingModel in simulator
Environment
macOS 26
Xcode Version 26.0 beta 7 (17A5305k)
Simulator: iPhone 16 Pro
iOS: iOS 26

Problem
NLContextualEmbedding.load() fails with the following error.

In simulator
Failed to load embedding from MIL representation: filesystem error: in create_directories: Permission denied ["/var/db/com.apple.naturallanguaged/com.apple.e5rt.e5bundlecache"]
filesystem error: in create_directories: Permission denied ["/var/db/com.apple.naturallanguaged/com.apple.e5rt.e5bundlecache"]
Failed to load embedding model 'mul_Latn' - '5C45D94E-BAB4-4927-94B6-8B5745C46289'
assetRequestFailed(Optional(Error Domain=NLNaturalLanguageErrorDomain Code=7 "Embedding model requires compilation" UserInfo={NSLocalizedDescription=Embedding model requires compilation}))
in #Playground

I'm new to this embedding model and am not sure whether the failure is caused by my code or by my environment.

Code snippet
    import Foundation
    import NaturalLanguage
    import Playgrounds

    #Playground {
        // Prefer initializing by script for broader coverage; returns NLContextualEmbedding?
        guard let embeddingModel = NLContextualEmbedding(script: .latin) else {
            print("Failed to create NLContextualEmbedding")
            return
        }
        print(embeddingModel.hasAvailableAssets)
        do {
            try embeddingModel.load()
            print("Model loaded")
        } catch {
            print("Failed to load model: \(error)")
        }
    }
3
3
3.1k
May ’26
Detection of Unavailable Characters (Tofu Box) in a String
Hi, what is the best way to detect whether part of a string contains an unavailable character, '□' (tofu box, or last-resort character)? So far it seems we would have to parse every string and check each character individually to see whether it is a valid Unicode scalar. Since we are a business application that handles a lot of string data, this would be rather performance-heavy. Is there a better or more efficient way to go about this?
1
0
257
Sep ’25
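For the tofu-box question above, the per-scalar scan the poster describes might look something like the sketch below. It is only a rough proxy: it flags U+FFFD and scalars whose Unicode general category is unassigned or private-use, which often render as a tofu box, but it does not consult any font, so a perfectly valid scalar that the current font cascade cannot draw will not be caught. The function name firstSuspectScalar is hypothetical and not part of any Apple framework.

```swift
// Hypothetical helper (not an Apple API): returns the first scalar that is
// likely to render as a tofu box, or nil if none is found.
// Assumption: "suspect" means U+FFFD, an unassigned code point, or a
// private-use code point. This is a Unicode-property check, not a font check.
func firstSuspectScalar(in text: String) -> Unicode.Scalar? {
    for scalar in text.unicodeScalars {
        // U+FFFD REPLACEMENT CHARACTER usually signals a failed decode upstream.
        if scalar.value == 0xFFFD {
            return scalar
        }
        switch scalar.properties.generalCategory {
        case .unassigned, .privateUse:
            return scalar
        default:
            continue
        }
    }
    return nil
}

let samples = ["Hello, мир", "A\u{FFFD}B", "x\u{E000}"]
for sample in samples {
    if let scalar = firstSuspectScalar(in: sample) {
        print("suspect scalar U+\(String(scalar.value, radix: 16, uppercase: true))")
    } else {
        print("no suspect scalar")
    }
}
```

The scan is a single linear pass that stops at the first hit, so the cost is proportional to the length of the string up to that point. Whether that is fast enough for a large data set would need measuring on real data.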