Java gcexcel: Optimizing Hyperlink Insertion and Save Time in Large Excel Files

Posted by: nazar.onishchenko on 12 August 2024, 10:04 am EST

    • Post Options:
    • Link

    Posted 12 August 2024, 10:04 am EST

    Hi,

    I need to generate a large Excel file with 20 columns and 100,000 rows, each containing various hyperlinks. Using the usual methods—range.setFormula(formula) or range.getHyperlinks().add(range, hyperlink)—is very slow, often taking over 60 seconds due to the lack of batch processing; each cell is updated individually. To address this, I directly set hyperlinks in the worksheet JSON, but this approach isn’t ideal.

    How can I improve the performance for setting hyperlinks? Additionally, I’m concerned that the workbook.save() operation takes more than 15 seconds for a 10MB file. What strategies can help optimize save time?

    Thank you!

  • Posted 13 August 2024, 8:31 am EST

    Hi Nazar,

    Apologies for the delayed response.

    There are various factors that can affect performance when editing and saving a workbook, especially when you have a lot of cells with formulas and hyperlinks. Given this, it may take some time. To determine if there are any performance issues, we request that you provide a sample project that demonstrates your implementation and the timings you mentioned. You can use any dummy data or hyperlinks for the sample.

    Additionally, we recommend testing with the latest version 7.1.5 of DsExcel to see if there are any performance improvements on your end.

    Thank you for your patience and cooperation.

    Kind Regards,

    Kartik

  • Posted 15 August 2024, 7:41 am EST - Updated 15 August 2024, 8:31 am EST

    Hi, here is simple test with latest version

    Version: com.grapecity.documents:gcexcel:7.1.5

    Config: MacBook Pro M1 Pro 32GB

    public class ExcelDemo {
    
        @Test
        public void testPerformance() throws IOException {
            var workbook = new Workbook();
            var worksheet = workbook.getWorksheets().add();
            worksheet.setName("Data");
    
            var style = workbook.getStyles().add("myCustomStyle");
            style.getFont().setSize(16);
            style.getFont().setColor(Color.GetLightGray());
    
            var dataStartTime = System.currentTimeMillis();
            var range = worksheet.getRange(0, 0, 50_000, 25);
            range.setValue(data());
            range.setStyle(style);
    
            var condition = (IFormatCondition) range.getFormatConditions().add(FormatConditionType.Expression, FormatConditionOperator.None, "=ISEVEN(ROW())", null);
            condition.getInterior().setColor(Color.FromArgb(238, 243, 248));
    
            System.out.println("Data Time (ms): " + (System.currentTimeMillis() - dataStartTime));
    
            var hyperlinkStartTime = System.currentTimeMillis();
            for (int i = 0; i < 50_000; i++) {
                worksheet.getHyperlinks().add(worksheet.getRange(i, 24), "https://google.com/" + i, null, null, "Google Link");
            }
            System.out.println("Hyperlink Time (ms): " + (System.currentTimeMillis() - hyperlinkStartTime));
    
            Files.createFile(Path.of("testFile.xlsx"));
    
            var saveStartTime = System.currentTimeMillis();
            workbook.save("testFile.xlsx", SaveFileFormat.Xlsx);
            System.out.println("Save Time (ms): " + (System.currentTimeMillis() - saveStartTime));
        }
    
        private Object[][] data() {
            Object[][] rows = new Object[50_000][25];
    
            Object[] columns = new Object[]{
                "column1",
                "column2",
                "column3",
                "column4",
                "column5",
                "column6",
                "column7",
                "column8",
                "column9",
                "column10",
                "column11",
                "column12",
                "column13",
                "column14",
                "column15",
                "column16",
                "column17",
                "column18",
                "column19",
                "column20",
                "column21",
                "column22",
                "column23",
                "column24",
                "column25"
            };
    
            Arrays.fill(rows, columns);
    
            return rows;
        }
    }

    Results (several attempts):

    Data Time (ms): 400 - 550

    Hyperlink Time (ms): 7000 - 15500

    Save Time (ms): 2000 - 3000)

  • Posted 15 August 2024, 9:34 am EST

    I also found that saving time increased when different data was used for the rows.

    public class ExcelDemo {
    
        @Test
        public void testPerformance() throws IOException {
            var workbook = new Workbook();
            var worksheet = workbook.getWorksheets().add();
            worksheet.setName("Data");
    
            var range = worksheet.getRange(0, 0, 50_000, 25);
            range.setValue(data());
    
            Files.createFile(Path.of("testFile.xlsx"));
    
            var saveStartTime = System.currentTimeMillis();
            workbook.save("testFile.xlsx", SaveFileFormat.Xlsx);
            System.out.println("Save Time (ms): " + (System.currentTimeMillis() - saveStartTime));
        }
    
        private Object[][] data() {
            Object[][] rows = new Object[50_000][25];
    
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    row[i] = generateRandomString();
                }
            }
    
            return rows;
        }
    
        private String generateRandomString() {
            return RandomStringUtils.random(10, true, true);
        }
    }

    Save Time (ms): 8609

  • Posted 16 August 2024, 8:02 am EST

    Hi Nazar,

    Apologies for the delayed response.

    Thank you for sharing the code snippets. We were able to observe the behavior you mentioned. We’ve reported this to the development team and will update you as soon as we receive their insights.

    [Internal Tracking ID: DOCXLS-11074]

    Just for your information, Excel has a limit of 65,530 hyperlinks per worksheet, as mentioned here. It seems your implementation exceeds this limit. We recommend using the HYPERLINK formula in the cells, which performs much faster than adding hyperlinks directly.

    Please see the updated code snippet below for handling hyperlinks.

     var hyperlinkStartTime = System.currentTimeMillis();
            
    for (int i = 0; i < 100_000; i++) {
           worksheet.getRange(i, 19).setFormula("=HYPERLINK(\"https://google.com/\", \"Google Link\")");
    }
            
    System.out.println("Hyperlink Time (ms): " + (System.currentTimeMillis() - hyperlinkStartTime));

    Best Regards,

    Kartik

  • Posted 16 August 2024, 12:02 pm EST

    Hi, thanks for the answer. I tried using a different formula for each cell, and it does take less time, but only when I disable workbook calculations with “workbook.setEnableCalculation(false)” (Disabling this feature is fine with me, as I don’t need it.);. However, it still takes a significant amount of time compared to setting an array of values.

    for (int i = 0; i < 50_000; i++) {
            worksheet.getRange(i, 25).setFormula("=HYPERLINK(\"https://google.com/" + i + "\", \"Google Link\")");
    }

    Formula Time (ms): 2992

  • Posted 19 August 2024, 3:13 am EST

    Hi,

    Apologize for the inconvenience. We have shared your concerns to the development team. Will get back to you once we have any updates from them.

    Regards,

    Nitin

  • Posted 29 August 2024, 9:11 am EST

    Hi, I hope you’re doing well! I was wondering if there are any updates

  • Posted 30 August 2024, 5:56 am EST

    Hi,

    The development team has made enhancements for adding hyperlinks in the latest 7.2.1 release. Using the code below, adding 100,000 hyperlinks takes around 400-500ms on both our end and the development team’s end.

    for (int i = 0; i < 100_000; i++) {
         worksheet.getHyperlinks().add(worksheet.getRange(i, 19), "https://google.com/" + i, null, null, "Google Link" + i);
    }

    We request that you test the performance on your end using the latest version and let us know if it meets your requirements. You can download the latest build from https://github.com/GrapeCity/DsExcel-Java/tree/master/dsexcel

    Best Regards,

    Kartik

  • Posted 30 August 2024, 7:17 am EST

    Hi,

    Thanks! The hyperlink works really fast—1504 ms for 100,000 rows. However, it’s important to note that there is still a limit of 65,530 hyperlinks per file.

    I also tried setting a formula, and it resulted in similar performance as before, taking 3311 ms for 50,000 rows. However, the saving process still takes considerable time—10473 ms for 50,000 rows.

  • Posted 2 September 2024, 3:42 am EST

    Hi,

    Saving the .xlsx file with 100,000 rows takes around 6000-7000ms on our end when running the earlier sample project. We’re getting in touch with the development team to see if this performance can be improved. We’ll keep you updated as soon as we have further information.

    Best Regards,

    Kartik

  • Posted 2 September 2024, 4:18 am EST

    Hi, thanks for your response. Looking forward to the updates.

  • Posted 18 September 2024, 3:59 am EST

    Hi,

    Apologies for the delay in response.

    After further research, the development team found that saving the .xlsx file does not have any significant performance issues in DsExcel, and it is difficult to achieve a substantial performance improvement. To make a better comparison, they used MS Excel to perform a similar hyperlinking operation and found that it took a significantly long time.

    Thank you for your patience and understanding.

    Best Regards,

    Kartik

  • Posted 20 September 2024, 8:45 am EST

    Hello, I appreciate you taking the time to investigate!

Need extra support?

Upgrade your support plan and get personal unlimited phone support with our customer engagement team

Learn More

Forum Channels