HVF FlatPartCache Inefficiency Causing Chinese Text Rendering Regression on iOS 18+
Summary

On iOS 18 and later, Chinese text rendering shows a noticeable performance regression related to the HVF (Hierarchical Variable Font) pipeline.

Environment

- iOS Version: iOS 18+
- Framework: libhvf.dylib (Hierarchical Variable Font)
- Affected Font: PingFangUI.ttc (private system font, automatically used for Chinese text)
- Related Frameworks: CoreText, CoreGraphics, FontParser
- Devices: All iOS devices (more noticeable on older hardware)

Background

iOS 18 Change:
- PingFang.ttc was removed from /System/Library/Fonts/
- Private PingFangUI.ttc was added (inaccessible via normal font APIs)
- System automatically uses PingFangUI.ttc for all Chinese text rendering
- PingFangUI.ttc contains HVF tables → utilizes libhvf.dylib

HVF Architecture:
- HVF (Hierarchical Variable Font) organizes glyphs as tree structures
- Each glyph = Composite → multiple Parts → nested hierarchy
- Rendering a single character requires traversing this tree

Key Observation

A single Chinese glyph typically triggers ~20 calls to HVF::LoaderHVGL::loadPartAtIndex. Cache invalidation is triggered via IncrementRenderCount after every 18 glyphs:

    __ZNK27THierVariationsDataForkFont20IncrementRenderCountEv:
        ldr   w8, [x0, #0x12c]
        add   w8, w8, #0x1
        str   w8, [x0, #0x12c]
        cmp   w8, #0x12
        b.lo  return
        ldr   x0, [x0, #0x120]
        bl    HVF_clear_part_cache
        str   wzr, [x19, #0x12c]
    return:
        ret

This causes the cache to be cleared before a typical sentence finishes rendering.
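Read as C, the listing above amounts to a simple counter with a fixed threshold. The following is a minimal sketch of that reading, not Apple's source: the `font_state` struct, its field names, and the `clear_part_cache` stub are hypothetical stand-ins, and only the threshold (0x12 = 18) and the reset to zero are taken from the disassembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the font object; only the counter matters here. */
typedef struct {
    uint32_t render_count;  /* the field at offset 0x12c in the listing */
    uint32_t clear_calls;   /* bookkeeping for this sketch only */
} font_state;

#define RENDER_COUNT_THRESHOLD 0x12u  /* 18, from `cmp w8, #0x12` */

/* Stub standing in for HVF_clear_part_cache. */
static void clear_part_cache(font_state *f) { f->clear_calls++; }

/* Mirrors the listing: bump the counter; at 18, clear the cache and reset. */
static void increment_render_count(font_state *f) {
    assert(f != NULL);
    f->render_count += 1;
    if (f->render_count < RENDER_COUNT_THRESHOLD) return;  /* b.lo return */
    clear_part_cache(f);
    f->render_count = 0;                                   /* str wzr, ... */
}

/* Number of full cache clears while rendering `glyphs` glyphs in a row. */
static uint32_t clears_for(uint32_t glyphs) {
    font_state f = {0, 0};
    for (uint32_t i = 0; i < glyphs; i++) increment_render_count(&f);
    return f.clear_calls;
}
```

Under this reading, and assuming the counter is bumped once per glyph, a 20-glyph sentence triggers one full clear at its 18th glyph, so the last two glyphs start from an empty cache.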
Complete Call Stack (Rendering Hot Path)

    #0-1   HVF::LoaderHVGL::loadPartAtIndex
    #2     HVF::FlatPartCache::partAtIndex
    #3     HVF::PartTransformRenderer::renderComposite
    #4     HVF::PartTransformRenderer::render
    #5     HVF::PartTransformRenderer::renderToContext
    #6     _HVF_render_current_part
    #7     THierVariationsFontHandler::GetOutlinePath
    #8     TFontHandler::CopyGlyphPath
    #9     THierVariationsFontHandler::CopyGlyphPath
    #10    TFPFont::CopyGlyphPath
    #11-12 TFPFont::CopyGlyphPath / _FPFontCopyGlyphPath
    #13    _CGFontCreateGlyphPath
    #14    _CGGlyphBuilderLockBitmaps
    #15    _render_glyphs
    #16    _draw_glyph_bitmaps
    #17    _ripc_DrawGlyphs
    #18    CG::DisplayList::executeEntries
    #19    _CGDisplayListDrawInContextDelegate
    #20    _CABackingStoreUpdate_
    #21-22 CALayer display/layout
    #23-24 CA::Transaction::commit
    #25-30 UIApplicationMain / RunLoop

HVF::LoaderHVGL::loadPartAtIndex is consistently observed as a hot function in Instruments and in production.

Cache Clear Call Stack

    #0 HVF::FlatPartCache::clear
    #1 HVF_clear_part_cache
    #2 THierVariationsDataForkFont::IncrementRenderCount
    #3 THierVariationsFontHandler::GetOutlinePath
    #4 TFontHandler::CopyGlyphPath
    #5 FPFontCopyGlyphPath
    #6 CGFontCreateGlyphPath
    #7 _render_glyphs
    #8 _draw_glyph_bitmaps
    #9 _ripc_DrawGlyphs

This shows that cache clearing occurs within the glyph rendering path.

Impact

For a typical Chinese sentence (~20 characters):
- Each glyph requires multiple part loads (~20 per glyph)
- Cache is cleared before rendering completes
- Previously loaded parts cannot be reused

Observed effects:
- Increased loadPartAtIndex invocation count
- Low cache hit rate
- Increased CPU usage in glyph rendering
- Main-thread blocking during Core Animation commit

Regression

- iOS 17 and earlier: Rendering is smooth under similar workloads.
- iOS 18+: Increased rendering cost and visible frame drops.

The issue is more pronounced on older devices such as iPhone XS and iPhone 11.
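To illustrate why a periodic full clear hurts reuse, here is a toy model in C. Everything in it is an assumption made for illustration: a pool of 64 shared parts, 20 parts per glyph, and a fixed overlap pattern between neighboring glyphs. It does not model the real FlatPartCache or real PingFangUI glyph data; it only counts how often a part has to be loaded with and without a clear every N glyphs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define POOL_SIZE 64u        /* hypothetical number of distinct parts */
#define PARTS_PER_GLYPH 20u  /* roughly the ~20 loads per glyph reported above */

/* Counts part loads for `glyphs` glyphs. A load happens on every cache miss.
 * If clear_every is nonzero, the whole cache is wiped after that many glyphs. */
static unsigned simulate_loads(unsigned glyphs, unsigned clear_every) {
    bool cached[POOL_SIZE];
    unsigned loads = 0, since_clear = 0;
    memset(cached, 0, sizeof cached);
    for (unsigned g = 0; g < glyphs; g++) {
        for (unsigned k = 0; k < PARTS_PER_GLYPH; k++) {
            /* Made-up overlap pattern: neighboring glyphs share 13 parts. */
            unsigned part = (g * 7u + k) % POOL_SIZE;
            assert(part < POOL_SIZE);
            if (!cached[part]) { cached[part] = true; loads++; }
        }
        if (clear_every != 0 && ++since_clear >= clear_every) {
            memset(cached, 0, sizeof cached);  /* full clear, as in the report */
            since_clear = 0;
        }
    }
    return loads;
}
```

In this toy model, 36 glyphs cost 64 loads with no clearing and 128 loads with a clear every 18 glyphs. The exact ratio depends entirely on the assumed overlap, so it says nothing about real hit rates, only about the direction of the effect.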
Reproduction

Render a Chinese text string longer than 18 characters, for example:

    刷新测试中文文本用于验证渲染性能问题需要超过十八个字

(Roughly: "Refresh test: Chinese text for verifying the rendering performance issue; needs more than eighteen characters.")

Observe:
- Repeated loadPartAtIndex calls
- Frequent cache clearing

Request

It would be helpful to review the cache eviction strategy for HVF, particularly for complex scripts such as Chinese. Potential considerations:
- Adjusting or scaling the cache threshold
- Avoiding full cache clears during continuous rendering
- Improving reuse of parts across glyphs within the same rendering batch
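As one illustration of "avoiding full cache clears", the sketch below bounds the cache by capacity and evicts only the least recently used entry when it is full. This is a generic LRU scheme named here as an example, not a description of how libhvf works or how it should be fixed; the type names, the capacity of 4, and the `loads` counter are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LRU_CAPACITY 4  /* tiny on purpose, to make eviction easy to follow */

typedef struct {
    int part_index;
    unsigned long last_used;
    bool occupied;
} lru_slot;

typedef struct {
    lru_slot slots[LRU_CAPACITY];
    unsigned long clock;  /* logical time, bumped on every lookup */
    unsigned loads;       /* number of misses, i.e. simulated part loads */
} lru_cache;

/* Returns true on a hit. On a miss, "loads" the part into a free slot,
 * or replaces the least recently used slot if none is free. */
static bool lru_get(lru_cache *c, int part_index) {
    assert(c != NULL);
    size_t victim = 0;
    c->clock++;
    for (size_t i = 0; i < LRU_CAPACITY; i++) {
        lru_slot *s = &c->slots[i];
        if (s->occupied && s->part_index == part_index) {
            s->last_used = c->clock;
            return true;
        }
        if (!s->occupied) {
            /* Prefer a free slot over evicting anything. */
            if (c->slots[victim].occupied) victim = i;
        } else if (c->slots[victim].occupied &&
                   s->last_used < c->slots[victim].last_used) {
            victim = i;
        }
    }
    c->slots[victim].part_index = part_index;
    c->slots[victim].last_used = c->clock;
    c->slots[victim].occupied = true;
    c->loads++;
    return false;
}
```

A zero-initialized `lru_cache` is empty. With this scheme a frequently reused part survives as long as it keeps being touched, whereas a counter-based full clear discards it regardless of use.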
Replies: 1 | Boosts: 0 | Views: 242 | Activity: 4w
Memory Zeroing Issue After iOS 18 Update
After the iOS 18 update, several new categories of crashes appeared in production: crashes in the sqlite pcache1 module, crashes involving PHAsset from the photo library, various objc_release crashes, and others. The scenarios and stacks all differ, but they share one feature: each crashes on an access to NULL or to NULL plus some offset. Our analysis shows that the direct cause is a pointer that previously pointed to valid memory and has since, inexplicably, become 0. We tried various approaches to rule out causes such as multi-threading problems.

To determine the cause, we deployed a simulated malloc guard check in production. The principle is simple:

1. Create some private NSString objects with random lengths, each larger than one physical memory page.
2. Set the first memory page of each object to read-only (with the object address aligned to a page boundary).
3. After a random interval (3 s to 10 s), set the memory back to read/write and immediately release the objects.
4. Repeat from step 1.

With this in place, any stray write to the memory of these objects triggers a write-protection crash and reports the faulting stack.

Surprisingly, after the malloc guard check was deployed, crashes did occur in production, but they were not caused by writes to read-only memory. Instead, they occurred when the NSString objects described above were released, and the objects' memory contained 0. We therefore added logging of the object memory contents after creation, before and after setting the memory to read-only, and before and after reverting it to read/write. The result was again unexpected: the logs showed that the object's isa pointer became 0 after the memory was set to read-only and before it was set back to read/write.
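The guard procedure described above can be sketched in plain C. This is a simplified stand-in, not our production code: it uses an anonymous `mmap` region filled with a known byte pattern instead of NSString objects, and every name in it (`guarded_buf`, `guard_cycle`, `GUARD_PATTERN`) is hypothetical. Because reads are still allowed on a read-only page, the contents can be checked before the protection is lifted, which is the check that exposed the zeroed memory.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* exposes MAP_ANON on glibc; harmless on Darwin */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define GUARD_PATTERN 0xA5

typedef struct {
    unsigned char *base;  /* page-aligned start of the mapping */
    size_t len;           /* total length, more than one page */
    size_t page;          /* system page size */
} guarded_buf;

/* Steps 1-2: allocate more than one page, fill it, make page one read-only. */
static bool guard_arm(guarded_buf *g, size_t extra_bytes) {
    long page = sysconf(_SC_PAGESIZE);
    assert(g != NULL && extra_bytes > 0);
    if (page <= 0) return false;
    g->page = (size_t)page;
    g->len = g->page + extra_bytes;
    void *p = mmap(NULL, g->len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED) return false;
    g->base = p;
    memset(g->base, GUARD_PATTERN, g->len);
    if (mprotect(g->base, g->page, PROT_READ) != 0) {
        munmap(g->base, g->len);
        return false;
    }
    return true;
}

/* Checks that the guarded first page still holds the pattern. */
static bool guard_first_page_intact(const guarded_buf *g) {
    for (size_t i = 0; i < g->page; i++)
        if (g->base[i] != GUARD_PATTERN) return false;
    return true;
}

/* Step 3: restore read/write and release the memory. */
static void guard_release(guarded_buf *g) {
    mprotect(g->base, g->page, PROT_READ | PROT_WRITE);
    munmap(g->base, g->len);
    g->base = NULL;
}

/* One cycle. Returns 0 if the page is intact, 1 if its contents changed
 * while read-only, -1 if setup failed. */
static int guard_cycle(size_t extra_bytes, unsigned hold_seconds) {
    guarded_buf g;
    if (!guard_arm(&g, extra_bytes)) return -1;
    if (hold_seconds > 0) sleep(hold_seconds);
    int result = guard_first_page_intact(&g) ? 0 : 1;
    guard_release(&g);
    return result;
}
```

A caller would run `guard_cycle` in a loop with a random hold time in the 3 to 10 second range and report any nonzero result. A stray write from the process itself would fault on the read-only page instead of returning 1, so a return value of 1 points at the contents changing by some other route.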
So why did the memory become 0 while it was read-only, without the read-only protection triggering a crash? We revised the plan again and added a test group: after an object is created, we mlock its memory, and we munlock it just before the object is released. The test group experienced no crashes; all crashes occurred in the control group. From this we conclude that the problem occurs at the system level and is related to the operating system's virtual memory subsystem. One possibility is that inactive memory pages are compressed and then zeroed, and that the subsequent decompression fails, which would leave the memory unexpectedly zeroed.

As mentioned at the beginning, this issue is very rare, but it shows up in a variety of scenarios, and it definitely first appeared after iOS 18. We hope Apple will look into this issue and fix it in a future release.
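The test-group variant can be sketched the same way. As before, this is a simplified illustration on an anonymous mapping rather than on NSString objects, and `locked_cycle` is a hypothetical name. `mlock` can fail (for example when the locked-memory limit is exceeded), so the sketch reports whether the lock actually took effect instead of assuming it did.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* exposes MAP_ANON on glibc; harmless on Darwin */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Allocates `len` bytes, fills them with a pattern, tries to pin them with
 * mlock, waits, then verifies the pattern. Returns 0 if intact, 1 if the
 * contents changed, -1 if allocation failed. *locked reports whether mlock
 * succeeded, so results can be split into locked and unlocked groups. */
static int locked_cycle(size_t len, unsigned hold_seconds, bool *locked) {
    assert(len > 0 && locked != NULL);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED) return -1;
    memset(p, 0x5A, len);

    /* Pinned pages stay resident, so they are not paged out. */
    *locked = (mlock(p, len) == 0);

    if (hold_seconds > 0) sleep(hold_seconds);

    int result = 0;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0x5A) { result = 1; break; }
    }

    if (*locked) munlock(p, len);
    munmap(p, len);
    return result;
}
```

Comparing how often the function returns 1 when `*locked` is true against how often it does when the `mlock` call is skipped reproduces the test-group and control-group split described above.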
Replies: 4 | Boosts: 0 | Views: 252 | Activity: Jul ’25